SML# lexical items consist of the following.
Keywords The following are SML# keywords; they cannot be used as identifiers.
abstype and andalso as case datatype do else end eqtype exception fn fun functor handle if in include infix infixr let local nonfix of op open orelse raise rec set sharing sig signature struct structure then type val where while with withtype ( ) [ ] { } , : :> ; ... _ = => -> #
⟨alphaId⟩ defined in the identifier section does not contain these keywords.
SQL keywords The following are keywords used in SQL expressions and cannot be used as identifiers within SQL expressions.
asc all begin by commit cross default delete desc distinct fetch first from group inner insert into is join limit natural next not null offset only on or order rollback row rows select set update values where
These strings are excluded from ⟨alphaId⟩ within SQL expressions, which begin with the _sql keyword introduced in Chapter 19 and defined in Chapter 22. Because SQL expressions are not expressions in the Definition of Standard ML, this restriction preserves backward compatibility.
Extended keywords The following names beginning with _ are keywords for SML#-specific features. Since they are not lexical items in the Definition of Standard ML, introducing them preserves backward compatibility.
__attribute__ _builtin _foreach _import _interface _join _dynamic _dynamiccase _polyrec _require _sizeof _sql _sqlserver _typeof _use
Identifiers Identifiers are names used in programs. SML# has the following 7 classes of identifiers. The class of an identifier is determined by the context in which it occurs, so the same name can be used for identifiers of different classes.
name usage
⟨vid⟩ variables, data constructors
⟨lab⟩ record labels
⟨strid⟩ structure names
⟨sigid⟩ signature names
⟨funid⟩ functor names
⟨tycon⟩ type constructors
⟨tyvar⟩ type variables
The structure of these identifiers is as follows.
⟨vid⟩ ::= ⟨alphaId⟩ | ⟨symbolId⟩
⟨lab⟩ ::= ⟨alphaId⟩ | ⟨string⟩ | ⟨decimal⟩ | ⟨decimal⟩ _ ⟨alphaId⟩ (Note 1)
⟨strid⟩ ::= ⟨alphaId⟩
⟨sigid⟩ ::= ⟨alphaId⟩
⟨funid⟩ ::= ⟨alphaId⟩
⟨tyvar⟩ ::= ’ ⟨alphaId⟩ | ’’ ⟨alphaId⟩
⟨tycon⟩ ::= ⟨alphaId⟩ | ⟨symbolId⟩
Note.
1. There are three kinds of record labels: character string labels (⟨alphaId⟩ or ⟨string⟩ ), integer labels (⟨decimal⟩ ), and ordered character string labels ([1-9][0-9]* _ ⟨alphaId⟩ ).
In SML#, record fields are sorted according to the ordering of their labels. Character string labels are ordered by String.compare. Integer labels are ordered by the integers they represent. Ordered character string labels are ordered lexicographically as pairs of an integer label and a character string label.
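The label kinds above can be illustrated with a minimal sketch. It uses only character string and integer labels, which are also valid in Standard ML; the ordered form (such as 1_name) is specific to SML# and is left out. All binding names are hypothetical.

```sml
(* Character string labels: the order in which fields are written does
   not matter, because fields are sorted by label. *)
val p = {name = "Ann", age = 30}
val q = {age = 30, name = "Ann"}
val sameRecord = (p = q)                 (* true *)

(* Integer labels: a record with labels 1 and 2 is a pair. *)
val t = {1 = "a", 2 = "b"}
val sameTuple = (t = ("a", "b"))         (* true *)

(* Labels are also used in field selectors. *)
val first = #1 t                         (* "a" *)
val who = #name p                        (* "Ann" *)
```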
The definitions of these character classes are given below.
⟨alpha⟩ ::= [A-Za-z\127-\255]
⟨symbol⟩ ::= ! | % | & | $ | + | / | : | < | = | > | ? | @ | | ‘ | | | # | - | ^ | \
⟨alphaId⟩ ::= ⟨alpha⟩ ( ⟨alpha⟩ | [0-9] | ’ | _ )* (Note 1)
⟨decimal⟩ ::= [1-9] [0-9]*
⟨symbolId⟩ ::= ⟨symbol⟩ * (Note 2)
Note.
1. The character class ⟨alphaId⟩ does not contain keywords. Furthermore, in SQL expressions that begin with _sql, ⟨alphaId⟩ does not contain SQL keywords.
2. ⟨symbolId⟩ does not contain keywords. Therefore ==> is an instance of ⟨symbolId⟩ but => is not.
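As a small sketch of these identifier classes, the following declarations use an alphanumeric identifier containing a prime and an underscore, a symbolic identifier, and both forms of type variable. The names are hypothetical, and the code is written in the Standard ML subset of SML#.

```sml
(* Alphanumeric identifiers may contain digits, primes and underscores
   after the first letter. *)
val x' = 1
val my_value2 = x' + 1

(* ==> is a symbolic identifier, whereas => alone is a keyword and could
   not be bound here. Without an infix declaration it is written in
   prefix position. *)
fun ==> (a, b) = not a orelse b
val implied = ==> (false, true)          (* true *)

(* 'a is a type variable; ''a is an equality type variable. *)
fun pair (x : 'a) = (x, x)
fun same (x : ''a, y : ''a) = (x = y)
```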
Long identifiers For ⟨vid⟩ (variable names and constructor names), ⟨tycon⟩ (type constructor names), and ⟨strid⟩ (structure names), the following long identifiers are defined.
⟨longVid⟩ ::= (⟨strid⟩ .)* ⟨vid⟩
⟨longTycon⟩ ::= (⟨strid⟩ .)* ⟨tycon⟩
⟨longStrid⟩ ::= (⟨strid⟩ .)* ⟨strid⟩
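The three long identifier forms can be seen in a short sketch. The structure Geometry and everything inside it are hypothetical names chosen for illustration.

```sml
structure Geometry =
struct
  datatype shape = Circle of real | Square of real
  structure Const =
  struct
    val unitSide = 1.0
  end
  fun area (Circle r) = 3.14159 * r * r
    | area (Square s) = s * s
end

(* longVid: a constructor, a value and a function reached through structures *)
val sq = Geometry.Square Geometry.Const.unitSide
val a = Geometry.area sq

(* longTycon: a type constructor used in a type annotation *)
val shapes : Geometry.shape list = [sq, Geometry.Circle 2.0]

(* longStrid: a nested structure named by its path *)
structure C = Geometry.Const
val side = C.unitSide
```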
Constant literals ⟨scon⟩ The syntax for constant literals is given below.
⟨scon⟩ ::= ⟨int⟩ | ⟨word⟩ | ⟨real⟩ | ⟨string⟩ | ⟨char⟩ Constant literals
⟨int⟩ ::= (~)?[0-9]+ Decimal integers
| (~)?0x[0-9a-fA-F]+ Hexadecimal integers
⟨word⟩ ::= 0w[0-9]+ unsigned decimal integers
| 0wx[0-9a-fA-F]+ unsigned hexadecimal integers
⟨real⟩ ::= (~)?[0-9]+ . [0-9]+ [Ee](~)?[0-9]+ Floating-point numbers
| (~)?[0-9]+ . [0-9]+
| (~)?[0-9]+ [Ee](~)?[0-9]+
⟨char⟩ ::= #" (⟨printable⟩ | ⟨escape⟩ ) " Character
⟨string⟩ ::= " (⟨printable⟩ | ⟨escape⟩ )∗ " String
⟨printable⟩ ::= characters except for \ and "
⟨escape⟩ ::= \a The warning character (ASCII 7)
| \b Backspace (ASCII 8)
| \t Horizontal tab (ASCII 9)
| \n New line character (ASCII 10)
| \v Vertical tab (ASCII 11)
| \f Form feed (ASCII 12)
| \r Carriage return (ASCII 13)
| \^[\064-\095] Control character represented by [\064-\095]
| \\ The \ character
| \" The " character
| \ddd The character of decimal code ddd
| \f · · · f\ Ignored, where f · · · f is a sequence of one or more whitespace characters
| \uxxxx The string of the UTF-8 character of hexadecimal code xxxx
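The literal forms above are easier to read next to concrete instances. The following sketch gives one binding per form; the names are hypothetical, and the \uxxxx escape is left out.

```sml
val i  = ~42            (* negative decimal integer *)
val h  = 0xFF           (* hexadecimal integer, 255 *)
val w  = 0w255          (* unsigned decimal integer *)
val wx = 0wxFF          (* unsigned hexadecimal integer, equal to 0w255 *)
val r1 = 3.14           (* real with a fraction *)
val r2 = 1E~3           (* real with an exponent only *)
val r3 = ~2.5e2         (* real with a fraction and an exponent *)
val c  = #"a"           (* character *)
val s1 = "tab\there"    (* \t escape: 8 characters *)
val s2 = "\065\066"     (* decimal codes: "AB" *)
val s3 = "\^A"          (* control character with code 1 *)
val s4 = "abc\
         \def"          (* whitespace between the backslashes is ignored: "abcdef" *)
```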
Chapter 18
Types
This chapter defines the syntax for types and describes the built-in types.
Types are divided into monotypes (⟨ty⟩ ) and polytypes (⟨polyTy⟩ ). The syntax for monotypes is given below.
⟨ty⟩ ::= ⟨tyvar⟩ type variable names
| {(⟨tyrow⟩ )?} record types
| ⟨ty1⟩ * · · · * ⟨tyn⟩ tuple types (n≥ 2)
| ⟨ty⟩ -> ⟨ty⟩ function types
| (⟨tySeq⟩ )? ⟨longTycon⟩ (parameterized) datatypes
| (⟨ty⟩ )
⟨tyrow⟩ ::= ⟨lab⟩ : ⟨ty⟩ (, ⟨tyrow⟩ )? record field types
⟨tyvar⟩ are type variable names. As defined in Section 17.2, they are written as ’a, ’foo or ’’a, ’’foo. The latter form denotes equality type variables, which range only over types that admit equality. A type that admits equality, called an eqtype, is any type that contains neither the function type constructor nor a built-in type that does not admit equality. A user-defined datatype is an eqtype if it contains only eqtypes. For example, τ list defined below is an eqtype if τ is an eqtype.
datatype ’a list = nil | :: of ’a * ’a list
The function type constructor -> associates to the right so that int -> int -> int is interpreted as int -> (int -> int).
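Both points, the right associativity of -> and the restriction of equality type variables to eqtypes, can be checked with a short sketch. The function names are hypothetical.

```sml
(* int -> int -> int is int -> (int -> int): add takes an int and
   returns a function. *)
val add : int -> int -> int = fn x => fn y => x + y
val inc : int -> int = add 1
val three = inc 2

(* ''a ranges over eqtypes only, so = may be applied to its values. *)
fun member (x : ''a, ys : ''a list) : bool =
  List.exists (fn y => y = x) ys

val yes = member (2, [1, 2, 3])          (* int is an eqtype *)
val no  = member ("d", ["a", "b"])       (* string is an eqtype *)
(* member (1.0, [1.0]) would be rejected: real does not admit equality. *)
```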
⟨longTycon⟩ is a type constructor name defined by a datatype declaration. Atomic types such as int are type constructors without type parameters (⟨tySeq⟩ ). SML# 4.0.0 supports the following built-in atomic types and type constructors.
type constructor name description eqtype?
int 32 bit long signed integers Yes
int64 64 bit long signed integers Yes
int16 16 bit long signed integers Yes
int8 8 bit long signed integers Yes
intInf unbounded signed integers Yes
word 32 bit long unsigned integers Yes
word64 64 bit long unsigned integers Yes
word16 16 bit long unsigned integers Yes
word8 8 bit long unsigned integers Yes
real floating point numbers No
real32 32 bit floating point numbers No
char characters Yes
string strings Yes
exn exceptions No
unit unit values (()) Yes
τ ref references (pointers) Yes
τ array arrays Yes
τ vector vectors Yes if τ is an eqtype
The unit type is a different type from the empty record type {}. This change is not backward-compatible with Standard ML.
The syntax for polytypes (⟨polyTy⟩ ) is given below.
⟨polyTy⟩ ::= ⟨ty⟩
| [ ⟨boundtyvarList⟩ . ⟨ty⟩ ]
| ⟨ty⟩ -> ⟨polyTy⟩
| ⟨polyTy⟩ * · · · * ⟨polyTy⟩
| { (⟨polyTyrow⟩ )? }
⟨boundtyvarList⟩ ::= ⟨boundtyvar⟩ (, ⟨boundtyvarList⟩ )?
⟨boundtyvar⟩ ::= ⟨tyvar⟩ ⟨kind⟩
⟨kind⟩ ::=
| #{ ⟨tyrow⟩ }
| # ⟨kindName⟩ ⟨kind⟩
⟨kindName⟩ ::= boxed | unboxed | reify | eq
⟨polyTyrow⟩ ::= ⟨lab⟩ : ⟨polyTy⟩ (, ⟨polyTyrow⟩ )?
• [ ⟨boundtyvarList⟩ . ⟨ty⟩ ] is a polymorphic type that makes the scope of the bound type variables ⟨boundtyvarList⟩ explicit.
• A bound type variable may carry the kind constraints ⟨kind⟩ . The record kind #{ ⟨tyrow⟩ } restricts the range of the bound type variable to record types that have at least the fields indicated by ⟨tyrow⟩ . The boxed kind restricts it to the types of heap-allocated values. The unboxed kind is the complement of the boxed kind. The eq kind restricts it to eqtypes. The reify kind is only an annotation stating that the type reification feature is required, so it does not restrict the range of the bound type variable.
The set of polytypes is the extension of the set of polytypes in the definition of Standard ML with record polymorphism, overloading and rank-1 polymorphism.
The following example has a rank-1 polytype.
# fn x => (fn y => (x,y), nil);
val it = fn : [’a. ’a -> [’b. ’b -> ’a * ’b] * [’b. ’b list]]
In the interactive session, in addition to the above kinds, you may see the overload kind :: { ⟨tyList⟩ }, which restricts the range of the bound type variable to ⟨tyList⟩ . The current system restricts overloading to system-defined primitives, and type variables with the overload kind are not allowed in user programs.
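The record kind described in this chapter arises whenever a function selects a field from an argument whose record type is otherwise unconstrained. The following is a minimal sketch that is expected to work in SML# only; in Standard ML the first declaration is rejected as an unresolved flexible record. The names are hypothetical.

```sml
(* SML# only: x may be any record that has at least a name field, which
   is the constraint a record kind of the form #{name : ...} expresses. *)
fun name x = #name x

val n1 = name {name = "Ann", age = 30}
val n2 = name {name = "Bob", id = 7, email = "bob@example.org"}
```

The two calls pass records with different sets of fields to the same function, which is what record polymorphism permits.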
Chapter 19
Expressions
The syntax for expressions (⟨exp⟩ ) is hierarchically defined below using infix operator expressions (⟨infix⟩ ), function application expressions (⟨appexp⟩ ), and atomic expressions (⟨atexp⟩ ).
• expressions (top-level)
⟨exp⟩ ::= ⟨infix⟩
| ⟨exp⟩ : ⟨ty⟩
| ⟨exp⟩ andalso ⟨exp⟩
| ⟨exp⟩ orelse ⟨exp⟩
| ⟨exp⟩ handle ⟨match⟩
| raise ⟨exp⟩
| if ⟨exp⟩ then ⟨exp⟩ else ⟨exp⟩
| while ⟨exp⟩ do ⟨exp⟩
| case ⟨exp⟩ of ⟨match⟩
| fn ⟨match⟩
| _import ⟨string⟩ : ⟨cfunty⟩ importing C function
| ⟨exp⟩ : _import ⟨cfunty⟩ importing C function
| _sizeof( ⟨ty⟩ ) size of type
| _dynamic ⟨exp⟩ as ⟨ty⟩ Dynamic type cast
| _dynamiccase ⟨exp⟩ of ⟨match⟩ case branches with dynamic type cast
| _sqlserver (⟨appexp⟩ )? : ⟨ty⟩ SQL servers
| _sql ⟨pat⟩ => ⟨sqlfn⟩ SQL execution function
| _sql ⟨sql⟩ SQL query fragments
⟨match⟩ ::= ⟨pat⟩ => ⟨exp⟩ (| ⟨match⟩ )? pattern matching
• infix operator expressions
⟨infix⟩ ::= ⟨appexp⟩
| ⟨infix⟩ ⟨vid⟩ ⟨infix⟩
• function application expressions
⟨appexp⟩ ::= ⟨atexp⟩
| ⟨appexp⟩ ⟨atexp⟩ left associative function applications
| ⟨appexp⟩ # { ⟨exprow⟩ } record field updates
• atomic expressions
⟨atexp⟩ ::= ⟨scon⟩ constants
| (op)? ⟨longVid⟩ identifiers
| {(⟨exprow⟩ )? } records
| (⟨exp1⟩ ,· · ·, ⟨expn⟩ ) tuples (n≥ 2)
| () unit value
| #⟨lab⟩ record field selector
| [⟨exp1⟩ ,· · ·, ⟨expn⟩ ] lists (n≥ 0)
| (⟨exp1⟩ ;· · ·; ⟨expn⟩ ) sequential execution
| let ⟨declList⟩ in ⟨exp1⟩ ;· · ·; ⟨expn⟩ end local declarations
| _sql (⟨sql⟩ ) SQL query fragments
| (⟨exp⟩ )
⟨exprow⟩ ::= ⟨lab⟩ = ⟨exp⟩ (, ⟨exprow⟩ )? record fields
The definition of ⟨cfunty⟩ is given in Section 19.21, and those of ⟨sql⟩ and ⟨sqlfn⟩ are given in Chapter 22.
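Among the forms in the grammar, the record field update ⟨appexp⟩ # { ⟨exprow⟩ } is an SML# extension. The following sketch assumes that an update yields a new record and leaves the original unchanged. It is not valid Standard ML, and the names are hypothetical.

```sml
(* SML# only: record field update *)
val r  = {x = 1, y = 2}
val r2 = r # {x = 10}       (* a record whose x is 10 and whose y is 2 *)
val oldX = #x r             (* still 1 *)
val newX = #x r2            (* 10 *)
```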
The hierarchical definition above encodes the associativity among expression constructors. The associativity of infix operator expressions ⟨infix⟩ is determined not by the syntax but by infix operator declarations. Section 19.1 first defines the elaboration rules for infix expressions. The subsequent sections then define each expression constructor and its type in order of associativity.
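Several of the top-level expression forms from the grammar can be seen together in a short sketch, restricted to forms shared with Standard ML. The exception and function names are hypothetical.

```sml
exception Empty

(* case, raise and pattern matching *)
fun head xs = case xs of [] => raise Empty | x :: _ => x

(* handle attaches a match on exceptions to an expression *)
val h0 = head [] handle Empty => 0

(* andalso evaluates its right operand only when the left one is true *)
fun inRange (lo, hi) x = lo <= x andalso x <= hi

(* let, while and sequential execution with references *)
fun sumTo n =
  let
    val i = ref 0
    val acc = ref 0
  in
    while !i < n do (i := !i + 1; acc := !acc + !i);
    !acc
  end
```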
19.1 Elaboration of infix expressions
The following infix declarations give identifiers in the ⟨vid⟩ class the infix operator property.
infix (n)? ⟨vidSeq⟩
infixr (n)? ⟨vidSeq⟩
infix defines ⟨vidSeq⟩ as left associative infix operators, and infixr defines ⟨vidSeq⟩ as right associative infix operators. The optional integer (n)? (from 0 to 9) specifies the association strength (9 is the strongest).
If n is omitted, 0 is assumed. The declaration
nonfix ⟨vidSeq⟩
cancels the infix operator property of the identifiers ⟨vidSeq⟩ .
Infix expressions are converted to applications to tuples according to the association strength.
source                      result
⟨exp1⟩ ⟨vid⟩ ⟨exp2⟩          op ⟨vid⟩ ( ⟨exp1⟩ , ⟨exp2⟩ )
⟨pat1⟩ ⟨vid⟩ ⟨pat2⟩          op ⟨vid⟩ ( ⟨pat1⟩ , ⟨pat2⟩ )
The syntax
op ⟨vid⟩
when it appears in expressions and patterns, cancels the infix status of ⟨vid⟩ . Therefore, if the identifier foo has infix status, the following two code fragments are equivalent.
1 foo 2
op foo (1,2)
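The declarations infixr, op and nonfix can be tried together in a short sketch. The operator +++ is a hypothetical name defined only for this illustration.

```sml
infixr 5 +++
fun xs +++ ys = xs @ ys

(* Right associative: parsed as [1] +++ ([2] +++ [3]) *)
val a = [1] +++ [2] +++ [3]

(* op cancels the infix status, so +++ is applied to a tuple *)
val b = op +++ ([1], [2, 3])

(* op also lets an infix operator be passed as an ordinary function *)
val total = foldl (op +) 0 [1, 2, 3]

nonfix +++
(* After nonfix, +++ is written in prefix position without op *)
val c = +++ ([1], [2, 3])
```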
The following are implicitly declared in every compilation unit and in the interactive mode.
infix 7 * / div mod
The hierarchical syntax above, together with infix declarations, determines the association strength of expressions. For example, a record update expression (⟨exp1⟩ # { ⟨lab⟩ = ⟨exp2⟩ }) associates more tightly than the strongest infix operators (those declared with infix 9), and expression constructs such as if ⟨exp⟩ then ⟨exp⟩ else ⟨exp⟩ associate more weakly than the weakest infix operators (those declared with infix 0).
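The effect of association strength on how an expression is grouped can be checked with a few small bindings. This sketch assumes the usual Standard ML top-level fixities for +, - and :: (infix 6 for + and -, infixr 5 for ::) in addition to the infix 7 declaration shown above. The names are hypothetical.

```sml
val a = 1 + 2 * 3            (* 1 + (2 * 3) = 7: * binds tighter than + *)
val b = 10 - 4 - 3           (* (10 - 4) - 3 = 3: - is left associative *)
val c = 1 :: 2 :: []         (* 1 :: (2 :: []): :: is right associative *)

(* Function application binds tighter than any infix operator. *)
fun double x = 2 * x
val d = double 3 + 1         (* (double 3) + 1 = 7 *)

(* if ... then ... else binds weaker than infix operators, so the else
   branch is the whole sum 2 + 3. *)
val e = if true then 1 else 2 + 3    (* 1 *)
```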