Arvind
Computer Science and Artificial Intelligence Laboratory M.I.T.
September 26, 2006 http://www.csg.csail.mit.edu/6.827
Type Inference
September 26, 2006
L06-2
Type Inference
• Type inference is typically presented in two different forms:
– Type inference rules: Rules define the type of each expression
• Clean and concise; needed to study the semantic properties, i.e., soundeness, of the type system – Type inference algorithm: Needed by the compiler
writer to deduce the type of each subexpression or to deduce that the expression is ill typed.
• Often it is nontrivial to derive an inference
algorithm for a given set of rules. There can be
many different algorithms for a set of typing rules.
L06-3
first ...
A Simple Type System
(no polymorphism)
L06-4
-calculus with Constants & Letrec
E ::= x | x.E | E E
| Cond (E, E, E)
| PF
k(E
1,...,E
k)
| CN
0| CN
k(E
1,...,E
k) | CN
k(SE
1,...,SE
k)
| let S in E
PF
1::= negate | not | ... | Prj
1| Prj
2| ...
PF
2::= + | ...
CN
0::= Number | Boolean CN
2::= cons | ...
Statements
S ::= | x = E | S; S
not in initial terms
There are no types in the syntax of the language!
L06-5
A Simple Type System
Types
::= base types
| t type variables
|
->
2Function types
Type Environments
TE ::= Identifiers Types
int, bool, ...
L06-6
Simple Type Inference Rules
Typing: TE ├ e :
(App) TE ├ e
1: ->
’TE ├ e
2:
TE ├ (e
1e
2) :
’(Abs)
TE ├ x.e : ->
’(Var)
TE ├ x :
(Const)
TE ├ c :
(Let)
TE ├ (let x = e
1in e
2) :
’TE + {x : } ├ e :
’(x : )TE
typeof(c) =
TE+{x : } ├ e
1: TE+{x : } ├ e
2:
’L06-7
Simple Type Inference Rules cont
(Cond)
TE ├ Cond(e
B, e
1, e
2) : (PF)
TE ├ (e
1+ e
2) : int (CN)
TE ├ CN(e
1, e
2) : CN(
(Pro)
TE ├ Prj
1(e) :
TE ├ e
B: bool TE ├ e
1: TE ├ e
2:
TE ├ e
1: int TE ├ e
2: int
TE ├ e
1:
TE ├ e
2:
TE ├ e : CN(
)
one such
rule for
each PF
one set
of such
rules for
each data
structure
type
L06-8
Simple Inference Algorithm
W(TE, e) returns (S,) such that S (TE) ├ e :
The type environment TE records the most general type of each identifier while the substitution S
records the changes in the type variables
Def W(TE, e) = Case e of
x = ...
x.e = ...
(e
1e
2) = ...
let x = e
1in e
2= ...
L06-9
Simple Inference Algorithm (cont-1)
Def W(TE, e) = Case e of
c = ({}, Typeof(c))
x = if (x Dom(TE)) then Fail else let = TE(x);
in ({}, )
x.e = let (S
1,
1) = W(TE + { x : u }, e);
in (S
1, S
1(u) ->
1) (e
1e
2) =
let x = e
1in e
2=
u’s
represent new type variables
let (S
1,
1) = W(TE, e
1);
(S
2,
2) = W(S
1(TE), e
2);
S
3= Unify(S
2(
1),
2-> u);
in (S
3S
2S
1, S
3(u))
let (S
1,
1) = W(TE + {x : u}, e
1);
S
2= Unify(S
1(u),
1);
(S
3,
2) = W(S
2S
1(TE) + {x :
}, e
2);
in (S
3S
2S
1,
2)
September 26, 2006 http://www.csg.csail.mit.edu/6.827 L06-10
Simple Inference Algorithm (cont-1)
Cond(e
B,e
1, e
2) =
(e
1+ e
2) =
CN(e
1, e
2) =
Prj
1(e) =
let (S
1,
1) = W(TE, e
1);
(S
2,
2) = W(S
1(TE), e
2);
S
3= Unify(
1, int); S
4= Unify(
2, int);
in (S
4S
3S
2S
1, int)
let (S
1,
1) = W(TE, e
1);
(S
2,
2) = W(S
1(TE), e
2);
in (S
2S
1, CN(
1,
2)) let (S, ) = W(TE, e);
S
1= Unify(, CN(u1,u2)));
in (S S, S (u1))
These should probably be type
constants let (S
B,
B) = W(TE, e
B);
(S
1,
1) = W(S
B(TE), e
1);
(S
2,
2) = W(S
1S
B(TE), e
2);
S
3= Unify(
B, bool); S
4= Unify(
1,
2);
in (S
4S
3S
2S
1S
B, S
4S
3
2)
L06-11
Type Inference : Example
let f = n.cond ((eq0 n), 1, n*(f (pred n)) in f
W( , A) =
W({f : u
1}, n.B) =
( [Int / u
2] , Bool ) W({f : u
1, n : u
2}, (eq0 n) ) =
W({f : u
1, n : u
2}, B) = ( [ ] , Int -> Int )
Unify(u
2, Int) =
A B
[ Int / u
2]
W({f : u
1, n : Int}, 1) = ( [ ] , Int) W({f : u
1, n : Int}, n*(f (pred n))) =
W({f : u
1, n : Int}, n) = ( [ ] , Int)
( [Int -> Int / u
1] , Int)
W({f : Int -> Int}, f ) = ( [ ] , Int -> Int)
([Int -> Int / u
1, Int / u
2] , Int) ([Int -> Int / u
1] , Int -> Int)
W({f : u
1, n : Int}, f (pred n)) = ( [Int -> u
3/ u
1] , u
3) Unify(u
1, Int -> u
3) = [ Int -> u
3/ u
1]
Unify(Int -> Int -> Int , Int -> u
3-> Int) = [ Int / u
3]
L06-12
Soundness
• The proposed type system is said to be sound if e : then e indeed evaluates to a value in
• A method of proving soundness:
– The semantics of the language is defined in terms of a value space that has integer
values, Boolean values etc. as subspaces.
– Any expression that may evaluate to a special value “wrong” will not type check.
– There is no type expression that denotes the subspace “wrong”.
Such proofs tend to be difficult ...
L06-13
Some observations
• A type system restricts the class of programs that are considered “legal”
• It is possible a term in the untyped –
calculus may not cause any errors but may not be typeable in a particular type system
let
id = x. x
in ... (id True) ... (id 1) ...
This term is not typeable in the simple type
system we have discussed so far.
L06-14
Polymorphic Types
Constraints:
let
id = x. x
in ... (id True) ... (id 1) ...
id :: t
1--> t
1id :: Int --> t
2id :: Bool --> t
3Does not unify!!
id :: t
1. t
1--> t
1Different uses of a generalized type variable may be instantiated differently
id
2: Bool --> Bool id
1: Int --> Int
Solution: Generalize the type variable
When can we
generalize?
L06-15
now...
The Hindley-Milner Type System
L06-16
A mini Language
to study Hindley-Milner Types
• There are no types in the syntax of the language!
• The type of each subexpression is derived by the Hindley-Milner type inference algorithm.
Expressions
E ::= c constant
| x variable
| x. E abstraction
| (E
1E
2) application
| let x = E
1in E
2let-block
L06-17
A Formal Type System
Note, all the ’s occur in the beginning of a type scheme, i.e., a type cannot contain a type scheme
Types
::= base types
| t type variables
|
->
2Function types
Type Schemes
::=
|t.
Type Environments
TE ::= Identifiers Type Schemes
L06-18
Instantiations
• Type scheme can be instantiated into a type ’ by substituting types for the bound variables of , i.e.,
’ = S for some S s.t. Dom(S) BV() - ’ is said to be an instance of (>’)
- ’ is said to be a generic instance of when S maps variables to new variables.
=t
1...t
n.
Example:
=t
1. t
1-> t
2t
3-> t
2is a generic instance of
Int -> t
2is a non generic instance of
L06-19
Generalization aka Closing
• Generalization introduces polymorphism
• Quantify type variables that are free in but not free in the type environment (TE)
• Captures the notion of new type variables of
Gen(TE,) = t
1...t
n.
where { t
1...t
n} = FV() - FV(TE)
L06-20
HM Type Inference Rules
Typing: TE ├ e :
(App) TE ├ e
1: ->
’TE ├ e
2:
TE ├ (e
1e
2) :
’(Abs)
TE ├ x.e : ->
’(Var)
TE ├ x :
(Const)
TE ├ c :
(Let)
TE ├ (let x = e
1in e
2) :
’TE + {x : } ├ e :
’(x : )TE
typeof(c)
TE+{x : } ├ e
1: TE+{x:Gen(TE,)} ├ e
2:
’L06-21
HM Inference Algorithm
Def W(TE, e) = Case e of
c = ({}, Typeof(c)) x =
x.e =
(e
1e
2) =
let x = e
1in e
2=
u’s
represent new type variables
let (S
1,
1) = W(TE, e
1);
(S
2,
2) = W(S
1(TE), e
2);
S
3= Unify(S
2(
1),
2-> u);
in (S
3S
2S
1, S
3(u))
if (x Dom(TE)) then Fail else let t
1...t
n.= TE(x);
in ( { }, [u
i/ t
i] )
let (S
1,
1) = W(TE + { x : u }, e);
in (S
1, S
1(u) ->
1)
let (S
1,
1) = W(TE + {x : u}, e
1);
S
2= Unify(S
1(u),
1);
= Gen(S
2S
1(TE), S
2(
1) );
(S
3,
2) = W(S
2S
1(TE) + {x : }, e
2);
in (S
3S
2S
1,
2)
L06-22
Hindley-Milner: Example
x. let f = y.x
in (f 1, f True)
W( , A) =
W({x : u
1}, B) =
( [ ] , u
1) ( [ ] , u
3-> u
1)
u
3.u
3-> u
1TE = {x : u
1, f : u
3.u
3-> u
1}
( [ ] , u
4-> u
1) W(TE, 1 ) = ( [ ] , Int )
[ Int / u
4, u
1/ u
5] ( [ ] , u
1)
( [ ] , (u
1,u
1) )
Unify(u
4-> u
1, Int -> u
5) = W({x : u
1, f : u
2, y : u
3}, x ) = W({x : u
1, f : u
2}, y.x ) =
Gen({x : u
1}, u
3-> u
1) = W(TE, (f 1) ) =
( [ ] , u
1-> (u
1,u
1) )
W(TE, f ) =
Unify(u
2, u
3-> u
1) =
B A
[ (u
3-> u
1) / u
2]
...
L06-23
Important Observations
• Do not generalize over type variables used elsewhere
• Let is the only way of defining polymorphic constructs
• Generalize the types of let-bound identifiers
only after processing their definitions
L06-24
Properties of HM Type Inference
• It is sound with respect to the type system.
An inferred type is verifiable.
• It generates most general types of expressions.
Any verifiable type is inferred.
• Complexity
PSPACE-Hard
DEXPTIME-Complete
Nested let blocks
L06-25
Extensions
• Type Declarations
Sanity check; can relax restrictions
• Incremental Type checking
The whole program is not given at the same time, sound inferencing when types of some functions are not known
• Typing references to mutable objects
Hindley-Milner system is unsound for a language with refs (mutable locations)
• Overloading Resolution
L06-26
HM Limitations:
-bound vs Let-bound Variables
Only let-bound identifiers can be instantiated differently.
let
twice f x = f (f x) in twice twice succ 4
vs.
let
twice f x = f (f x) foo g = (g g succ) 4 in foo twice
foo is not type correct !
Generic vs. Non-generic type variables
L06-27
Puzzle: Another set of Inference rules
(Gen) TE ├ e : t FV(TE) TE ├ e : t.
(Spec) TE ├ e : t.
TE ├ e : [t’/t]
(Var) (x : ) TE TE ├ x :
(Let) TE+{x:} ├ e
1: TE+{x:} ├ e
2:
’TE ├ (let x = e
1in e
2) :
’(App) and (Abs) rules remain unchanged.
Sound but no
inference
algorithm !
L06-28