The use of dependent types allows for the type-checking of systems without syntax-directed rules, or even an obvious global analysis. Tracking constraints explicitly, as in the types of L terms, aids this process and conveniently separates program implementation from proof development. Skilled developers without experience in formalized theorem proving can therefore make assumptions and defer the proof of validity to a machine, via a tactic, or to another member of their team. Two primary areas of further investigation immediately emerge:
1. An exploration of expanded programming features.
2. Developing a correct-by-construction proof of compilation correctness.
As discussed above, the current implementation has hard-coded type support limited to lists and integers. A natural extension of this idea is to introduce a more expressive universe such as the polynomial functors or container types.
6.2.1 Expansion of programming features
The language L demonstrates the feasibility of interactively proving properties about programs using semantic indexing in the context of dependent types. However, the current implementation has hard-coded type support limited to lists and integers. As discussed in the limitations, expanding the scope to a general class of types is an obvious target of exploration.
Extensions to new type systems
Although the current usage of constraint reflection for proving the type correctness of expressions is restricted to working with simple types, there is no fundamental reason this should be the case. The inclusion of lists in L exemplifies the flexibility of the approach in handling structured data. Thus, an interesting area of work would be the application to the variant of System Fω described by Morrisett [Mor06], which captures SML modules and Haskell typeclasses.
Closely related is the system of LXres. In particular, LXres, like Fω, uses its sophisticated kind system to calculate types, restricting the possible shape of runtime values. Crary and Weirich note that this places a requirement on the programmer to carefully represent types in a way which is “not too abstract”. In contrast, a constraint-based approach would allow for a more direct encoding of these restrictions at the expense of decidable type-checking.
This use of constraints naturally raises the question of its application in formalizing the correctness of constraint-based systems such as F^RGN and λ^rgnUL of Fluet et al. [FMA06] and the closely related language Rust. These systems allow safe, low-level access to memory by the use of region and lifetime systems. Regions represent a slice of memory, whereas lifetimes represent the scope of a region’s use. Moreover, these types can be coerced: for example, if a region is expanded then a notion of subtyping applies, and the two regions likely have related lifetimes. The approach of using constraints to prove the validity of coercions in L should adapt to support correct-by-construction notions of subtyping.
While all of the above are relatively expressive languages, it would be interesting to apply the strategies developed in this thesis to an EDSL such as Feldspar. In particular, targeting areas such as hard real-time systems, which often require strong guarantees but use relatively simple programming constructs, might introduce the possibility of more automated rewrite tactics. As an example, because index expressions in L are themselves first order, an index expression reflection procedure could be developed without the need for explicit first-class reflection support: index expressions can be automatically reflected into an underlying monoid expression consumable by the monoid solver. Hard real-time systems could also provide an opportunity to expand the scope of the costing framework developed in chapter 5 on page 136.
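The reflection step described above can be illustrated with a minimal Idris sketch. It assumes a hypothetical first-order index language containing only variables, an empty index, and append; the names IxExpr, IxVar, IxEmpty, IxAppend, and normalise are illustrative and are not taken from the implementation of L. List flattening stands in for the monoid solver's normal form.

```idris
-- Hypothetical first-order index expressions (not the definitions used in L).
data IxExpr : Type where
  IxVar    : Nat -> IxExpr               -- opaque index variable, named by number
  IxEmpty  : IxExpr                      -- neutral element
  IxAppend : IxExpr -> IxExpr -> IxExpr  -- monoid operation

-- Reflect an index expression into the free monoid on variable names.
-- Lists under (++) form a free monoid, so the associativity and identity
-- laws hold on the normal form by construction.
normalise : IxExpr -> List Nat
normalise (IxVar n)      = [n]
normalise IxEmpty        = []
normalise (IxAppend l r) = normalise l ++ normalise r

-- Two differently bracketed expressions share a normal form,
-- so the equation is closed by computation alone.
reassoc : normalise (IxAppend (IxVar 0) (IxAppend (IxVar 1) IxEmpty))
        = normalise (IxAppend (IxAppend (IxVar 0) (IxVar 1)) IxEmpty)
reassoc = Refl
```

Because the expression type is an ordinary first-order datatype, the reflection is a plain structural recursion and needs no language-level reflection support. A full solver would additionally prove that normalise preserves the interpretation of each expression.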
Probabilistic cost analysis
One of the interesting aspects of the cost analysis developed in chapter 5 is the use of an opaque costing function at statement leaves. Such a system is amenable to a probabilistic analysis, loosely coupled with the underlying ISA. Rather than being assigned an identifiable, discrete cost, atomic expressions could have their costs drawn from some density function. While this abandons any certainty that might be required by a soft or hard real-time system, the introduced error can be bounded using well-understood statistical techniques. If the tail of the cost’s density function is assumed to decay rapidly enough (e.g. sub-gaussian), then estimates of the underlying value can be made to deviate by at most ε with confidence 1 − δ.
This fact follows from an application of Hoeffding’s inequality and could be included directly in the cost trace. For instance, a trace could provide some likely cost along with a proof that this value deviates from some “true cost” by more than ε with likelihood no more than δ. This is possible by treating accuracy and confidence like a resource. At the top level some finite amount of accuracy and confidence is available, which is then split between sub-trees and distributed to the leaves as required. Such a cost guarantee is given with respect to the cost of executing on the machine which gives the “true cost”. In our case this could be given using the abstract machine defined in section 5.3 on page 158. If the compilation scheme were shown to be sound, the system could provide a strong guarantee of correctness in both a program’s functional and non-functional aspects.
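As a sketch of the bound involved, write ε for the accuracy and δ for the failure probability. Assume further (these assumptions are not stated in the text) that a leaf cost is estimated by the mean $\bar{X}$ of $n$ independent samples $X_1, \dots, X_n$, each taking values in a known interval $[a, b]$. Boundedness is a stronger condition than a sub-gaussian tail, but it gives the simplest form of Hoeffding's inequality:

\[
\Pr\left( \left| \bar{X} - \mathbb{E}[\bar{X}] \right| \ge \varepsilon \right) \le 2 \exp\left( -\frac{2 n \varepsilon^2}{(b - a)^2} \right).
\]

Setting the right-hand side equal to $\delta$ and solving for $\varepsilon$ gives the accuracy achievable at a given confidence:

\[
\varepsilon = (b - a) \sqrt{\frac{\ln(2/\delta)}{2n}}.
\]

The resource reading then follows from the triangle inequality and a union bound. Suppose the total cost is the sum of $k$ leaf costs and leaf $j$ deviates by more than $\varepsilon_j$ with probability at most $\delta_j$. Then

\[
\Pr\left( \left| \sum_{j=1}^{k} \bar{X}_j - \sum_{j=1}^{k} \mathbb{E}[\bar{X}_j] \right| > \sum_{j=1}^{k} \varepsilon_j \right) \le \sum_{j=1}^{k} \delta_j ,
\]

so a top-level budget $(\varepsilon, \delta)$ can be divided among the leaves in any way satisfying $\sum_j \varepsilon_j \le \varepsilon$ and $\sum_j \delta_j \le \delta$.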
Given these additional proofs of correctness, the only remaining piece of the compilation stack which is not known to be sound is the final compilation process from statements to abstract machine.
6.2.2 Proof of sound compilation process
While the current compilation process gives a strong guarantee of safety with respect to type preservation, it lacks two further important properties: correct-by-construction index expression erasure and a proof of soundness. Solving both of these problems should be tractable.
In order to prove the safety of index expression erasure, rather than compiling directly from S, a new intermediate form should be introduced without support for explicit coercions. This should not be a problem since currently the only occurrences of rewrites in a statement resulting from a decomposed expression are those introduced by case expressions. In fact, rewrites are nearly erased by the current decomposition scheme since all existing rewrites are composed and pushed to a single, top-level rewrite. The primary issue identified in section 5.2 on page 146 is the failure to eliminate the translated index expressions of decomposed expressions from the retained constraint context.
Intuitively, the issue of stale index expressions in the constraint context should be solvable because at each step of rewriting only a single index expression is transformed. Thus, rather than producing a statement well-defined in the input constraint context Ξ, some new constraint context Ξ′ is produced. The soundness of such an approach should follow in the same style as the rewritten index expression j. Since only one index expression is ever transformed, a stack of proofs transporting between identified index expressions could be used to identify constraint contexts. Intuitively, this stack encodes the sequence of rewrites necessary to perform at each level of the input constraint context to arrive at the output context. This would also give us a rewrite minimality law:
Hypothesis 6.2.0.1 (Rewrite minimality). Every expression of index expression i with rewrites in Ξ can be decomposed into an expression of index expression j without rewrites such that $\exists \Xi'.\ i \approx_{\Xi'} j \wedge \Xi \approx \Xi'$.
In fact, the author has made great progress in mechanizing this proof; however, due to efficiency problems in the current implementation, attempting to complete the proof results in an explosion of memory consumption.
Given a notion of terms without rewrites, the validity of their erasure is obvious without the need for an auxiliary proof that evaluation agrees on terms with and without index expressions. Compilation from a language without rewrites would then only need to be index expression respecting in order to be constructively sound. That is to say, internalizing soundness requires that the semantic index of the source be maintained. Since rewrites are no longer available, the constraint context no longer needs to be tracked, and thus the only non-trivial cases are the looping constructs: list and nat iteration. As discussed in section 5.3.2 on page 163, a possible solution is to generate accessibility proofs from the obviously terminating folds. This fixes not only the termination issue in the evaluation function but additionally the issue stemming from the potential mis-application of GetConsPos since
The developments in this thesis using well-understood dependently typed paradigms have worked towards a sound compilation stack for a simple language. Despite great strides in theorem proving technology, mechanizing correctness and experimenting with type systems as in the POPLmark challenge [Ayd+05] still requires expertise in using theorem provers. We think the use of constraints and explicit coercions forms an interesting basis for mechanizing type systems with non-trivial reduction rules, allowing skilled programmers to work as usual and deferring necessary proofs to custom-tailored tactics or other members of the team. Still, more work is required to explore a broader range of constraints, such as linear inequalities, and to sufficiently automate common rewrites such as the usual style of reduction in dependently typed languages. We would like to eventually work towards a system which mixes completely automated rewrites, computed over the constraints in context, with strategies that defer difficult-to-solve proofs, as is done by Coq’s Program Definition. Such a system could produce a term which proves some proposition given proof of several constraints it describes as pure data: (cs : Constraints ctx) -> (EvalConstraints cs -> Dec p).
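The shape of such a term can be illustrated with a small Idris sketch. It makes several simplifying assumptions: constraints are only equalities between natural numbers, the ctx index is dropped, and the names Constraint, EqNat, commConstraints, and decideComm are hypothetical. It is meant only to show constraints described as data alongside a decision that consumes their proofs, not the design of the envisioned system.

```idris
-- Hypothetical constraint language: equalities between naturals only.
data Constraint : Type where
  EqNat : Nat -> Nat -> Constraint

-- Interpret a list of constraints as the tuple of proofs it demands.
EvalConstraints : List Constraint -> Type
EvalConstraints []                = ()
EvalConstraints (EqNat a b :: cs) = (a = b, EvalConstraints cs)

-- The obligations are produced as pure data...
commConstraints : (n, m : Nat) -> List Constraint
commConstraints n m = [EqNat (n + m) (m + n)]

-- ...and the decision is available once someone (a tactic or a
-- colleague) supplies proofs of those obligations.
decideComm : (n, m : Nat)
          -> EvalConstraints (commConstraints n m)
          -> Dec (n + m = m + n)
decideComm n m (prf, ()) = Yes prf
```

In this sketch the programmer writes decideComm without proving commutativity, and the proof obligation surfaces as an ordinary argument that can be discharged later.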
Appendix A
Auxiliary correctness proofs of L programs
A.1 Correctness of ixSnd
ixSndCorrect : (xs : List Nat) -> ixSndEval xs = listSnd xs
ixSndCorrect (x :: y :: _) = Refl
ixSndCorrect (x :: _) = Refl
ixSndCorrect [] = Refl