Ascription and Casting - Benjamin C Pierce Types and Programming Languages The MIT Press(2002)

The ascription operator t as T was introduced in §11.4 as a form of checked documentation, allowing the programmer to record in the text of the program the assertion that some subterm of a complex expression has some particular type. In the examples in this book, ascription is also used to control the way in which types are printed, forcing the

typechecker to use a more readable abbreviated form instead of the type that it has actually calculated for a term. In languages with subtyping such as Java and C++, ascription becomes quite a bit more interesting. It is often called

casting in these languages, and is written (T)t. There are actually two quite different forms of casting-so-called up-casts and down-casts. The former are straightforward; the latter, which involve dynamic type-testing, require a significant extension.

Up-casts, in which a term is ascribed a supertype of the type that the typechecker would naturally assign it, are instances of the standard ascription operator. We give a term t and a type T at which we intend to "view" t. The typechecker verifies that T is indeed one of the types of t by attempting to build a derivation

using the "natural" typing of t, the subsumption rule T-SUB, and the ascription rule from §11.4:

Up-casts can be viewed as a form of abstraction-a way of hiding the existence of some parts of a value so that they cannot be used in some surrounding context. For example, if t is a record (or, more generally, an object), then we can use an up-cast to hide some of its fields (methods).

A down-cast, on the other hand, allows us to assign types to terms that the typechecker cannot derive statically. To allow down-casts, we make a somewhat surprising change to the typing rule for as:

That is, we check that t1 is well typed (i.e., that it has some type S) and then assign it type T, without making any demand about the relation between S and T. For example, using down-casting we can write a function f that takes any argument whatsoever, casts it down to a record with an a field containing a number, and returns this number:

In effect, the programmer is saying to the typechecker, "I know (for reasons that are too complex to explain in terms of the typing rules) that f will always be applied to record arguments with numeric a fields; I want you to trust me on this one."

Of course, blindly trusting such assertions will have a disastrous effect on the safety of our language: if the programmer somehow makes a mistake and applies f to a record that does not contain an a field, the results might (depending on the details of the compiler) be completely arbitrary! Instead, our motto should be "trust, but verify." At compile time, the typechecker simply accepts the type given in the down-cast. However, it inserts a check that, at run time, will verify that the actual value does indeed have the type claimed. In other words, the evaluation rule for ascriptions should not just discard the annotation, as our original evaluation rule for ascriptions did,

but should first compare the actual (run-time) type of the value with the declared type:

For example, if we apply the function f above to the argument {a=5,b=true}, then this rule will check (successfully) that ? {a=5,b=true} : {a:Nat}. On the other hand, if we apply f to {b=true}, then the E-DOWNCAST rule will not apply and evaluation will get stuck at this point. This run-time check recovers the type preservation property.

15.5.1 Exercise [?? ?]

Prove this.

Of course, we lose progress, since a well-typed program can certainly get stuck by attempting to evaluate a bad down-cast. Languages that provide down-casts normally address this in one of two ways: either by making a failed down-cast raise a dynamic exception that can be caught and handled by the program (cf. Chapter 14) or else by replacing the down-cast operator by a form of dynamic type test:

Uses of down-casts are actually quite common in languages like Java. In particular, down-casts support a kind of "poor-man's polymorphism." For example, "collection classes" such as Set and List are monomorphic in Java: instead of providing a type List T (lists containing elements of type T) for every type T, Java provides just List, the type of lists whose elements belong to the maximal type Object. Since Object is a supertype of every other type of objects in Java, this means that lists may actually contain anything at all: when we want to add an element to a list, we simply use subsumption to promote its type to Object. However, when we take an element out of a list, all the typechecker knows about it is that it has type Object. This type does not warrant calling most of the methods of the object, since the type Object mentions only a few very generic methods for printing and such, which are shared by all Java objects. In order to do anything useful with it, we must first downcast it to some expected type T.

It has been argued-for example, by the designers of Pizza (Odersky and Wadler, 1997), GJ (Bracha, Odersky, Stoutamire, and Wadler, 1998), PolyJ (Myers, Bank, and Liskov, 1997), and NextGen (Cartwright and Steele, 1998)-that it is better to extend the Java type system with real polymorphism (cf. Chapter 23), which is both safer and more efficient than the down-cast idiom, requiring no run-time tests. On the other hand, such extensions add significant complexity to an already-large language, interacting with many other features of the language and type

there is no way that the typechecker can statically predict the shape of the class that will be loaded at this point (the bytecode file can be obtained on demand from across the net, for example), so the best it can do is to assign the maximal type Object to the newly created instance. Again, in order to do anything useful, we must downcast the new object to some expected type T, handle the run-time exception that may result if the class provided by the bytecode file does not actually match this type, and then go ahead and use it with type T.

To close the discussion of down-casts, a note about implementation is in order. It seems, from the rules we have given, that including down-casts to a language involves adding all the machinery for typechecking to the run-time system. Worse, since values are typically represented differently at run time than inside the compiler (in particular, functions are compiled into byte-codes or native machine instructions), it appears that we will need to write a different typechecker for calculating the types needed in dynamic checks. To avoid this, real languages combine down-casts with type tags-single-word tags (similar in some ways to ML's datatype constructors and the variant tags in §11.10) that capture a run-time "residue" of compile-time types and that are sufficient to perform dynamic subtype tests. Chapter 19 develops one instance of this mechanism in detail.

Variants

The subtyping rules for variants (cf. §11.10) are nearly identical to the ones for records; the only difference is that the width rule S-VARIANTWIDTH allows new variants to be added, not dropped, when moving from a subtype to a supertype. The intuition is that a tagged expression <l=t> belongs to a variant type <li:TiiÎ1..n> if its label l is one of the possible labels {Ti} listed in the type; adding more labels to this set decreases the information it gives us about its elements. A singleton variant type <l1:T1> tells us precisely what label its elements are tagged with; a two-variant type <l1: T1, l2:T2> tells us that its elements have either label l1 or label l2, etc. Conversely, when we use variant values, it is always in the context of a case statement, which must have one branch for each variant listed by the type-listing more variants just means forcing case statements to include some unnecessary extra branches.

Figure 15-5: Variants and Subtyping

Another consequence of combining subtyping and variants is that we can drop the annotation from the tagging construct, writing just <l=t> instead of <l=t> as <li:TiiÎ1..n>, as we did in §11.10, and changing the typing rule for tagging so that it assigns <l1=t1> the precise type <l1:T1>. We can then use subsumption plus S-VARIANTWIDTH to obtain any larger variant type.

Lists

We have seen a number of examples of covariant type constructors (records and variants, as well as function types, on their right-hand sides) and one contravariant constructor (arrow, on the left-hand side). The List constructor is also covariant: if we have a list whose elements have type S1, and S1<: T1, then we can safely regard our list as having elements of type T1.

In document Benjamin C Pierce Types and Programming Languages The MIT Press(2002) pdf (Page 187-190)