Static Semantics and Inter-Document Consistency in SSL: SSL offers a means of defining static semantics of languages in terms of attributes and equations. Attributes are
6.6.2 METAL, TYPOL and PPML
The Centaur system evolved from the Mentor project [DGKLM84], which was carried out at INRIA between 1974 and 1986. Mentor and its successor Centaur are meant to support the prototyping of languages. They therefore provide a framework in which a language can be defined in terms of its syntax, its static semantics and even its dynamic semantics. To this end, Centaur offers several languages. Abstract and concrete syntax are defined in the language METAL [KLM83]. Unparsing schemes are defined in the pretty-printing meta language (PPML). Static and dynamic semantics can be defined in a rule-based language called TYPOL [Des88]. TYPOL is based on structural operational semantics [Plo81].
Abstract Syntax in METAL:
Similar to SSL, METAL supports the definition of the abstract syntax of a language in terms of phyla and operators. Identifiers in lower-case letters denote operators, and phylum identifiers are given in upper-case letters. An operator declaration consists of two components separated by ->. The left-hand side defines the name of the operator and the right-hand side defines a list of argument phyla. A phylum is declared by an identifier followed by ::= and a list of operators that can be applied to the phylum. For a better comparison with GTSL and SSL, consider the METAL fragment below. It defines an excerpt of the abstract syntax of the Groupie interface language.
definition of MIE is
  abstract syntax
    MODULE    ::= adt_module ado_module f_module tc_module;
    COMMENT   ::= comment comment_nil;
    OP_LIST   ::= op_list;
    OPERATION ::= function procedure;
    adt_module  -> IDENT COMMENT IDENT OP_LIST IMPORT_PART IDENT;
    ado_module  -> IDENT COMMENT OP_LIST IMPORT_PART IDENT;
    f_module    -> IDENT COMMENT OP_LIST IMPORT_PART IDENT;
    tc_module   -> IDENT COMMENT IDENT_LIST IDENT;
    function    -> IDENT PAR_LIST IDENT COMMENT;
    procedure   -> IDENT PAR_LIST COMMENT;
    op_list     -> OPERATION * ...;
    comment     -> implemented as STRING;
    comment_nil -> implemented as SINGLETON;
    ...
List phyla declarations (such as OP_LIST) are easier in METAL and GTSL than in SSL. Also, placeholder definitions are implicit in METAL and GTSL, whereas they have to be specified explicitly in SSL. As in SSL, common properties of operators cannot be specified as such in METAL. Instead, they have to be duplicated for each operator. GTSL overcomes this through the inheritance of abstract syntax children.
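The phylum/operator structure described above can be illustrated with a small Python sketch. It is only a model of part of the METAL fragment, not anything Centaur generates: the table layout, the function name well_formed, and the stand-in leaf operators ident, par_list and import_part are assumptions made for the example, since the excerpt does not spell out those phyla.

```python
# Hypothetical Python model of part of the METAL abstract syntax excerpt.
# A phylum maps to the set of operators that may stand where it is expected.
PHYLA = {
    "MODULE": {"adt_module", "ado_module", "f_module", "tc_module"},
    "COMMENT": {"comment", "comment_nil"},
    "OP_LIST": {"op_list"},
    "OPERATION": {"function", "procedure"},
    # Assumed stand-ins: these phyla are not spelled out in the excerpt.
    "IDENT": {"ident"},
    "PAR_LIST": {"par_list"},
    "IMPORT_PART": {"import_part"},
}

# Fixed-arity operators: one argument phylum per child.
FIXED = {
    "adt_module": ["IDENT", "COMMENT", "IDENT", "OP_LIST", "IMPORT_PART", "IDENT"],
    "function": ["IDENT", "PAR_LIST", "IDENT", "COMMENT"],
}
# List operators: any number of children drawn from a single phylum.
LISTS = {"op_list": "OPERATION"}
# Leaf operators carry a value instead of children.
LEAVES = {"ident", "comment", "comment_nil", "par_list", "import_part"}


def well_formed(tree, phylum):
    """Check that tree, a pair (operator, children), belongs to phylum."""
    op, children = tree
    if op not in PHYLA.get(phylum, set()):
        return False
    if op in LEAVES:
        return True
    if op in LISTS:
        return all(well_formed(child, LISTS[op]) for child in children)
    if op not in FIXED:
        return False  # operator not modelled in this sketch
    expected = FIXED[op]
    return len(children) == len(expected) and all(
        well_formed(child, p) for child, p in zip(children, expected)
    )


def ident(name):
    return ("ident", name)


top = ("function", [ident("top"), ("par_list", None), ident("Elem"),
                    ("comment_nil", None)])
stack = ("adt_module", [ident("Stack"), ("comment", "a stack"), ident("StackT"),
                        ("op_list", [top]), ("import_part", None), ident("Stack")])
```

Here well_formed(stack, "MODULE") holds, whereas a tree whose op_list contains a bare identifier is rejected, because ident is not an operator of phylum OPERATION.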
Concrete Syntax in METAL:
The Centaur system distinguishes between the definition of the input syntax and the output representation of an abstract syntax tree. The input syntax is defined in METAL, whereas the output representation is defined in a dedicated language (PPML). The concrete syntax in METAL is defined in terms of rules. A rule in METAL resembles a production of a context-free grammar, except that it adds a call to a tree-building function. This call defines the relationship to the abstract syntax definition. As an example, consider the concrete syntax definition excerpt below.
rules
  <module> ::= #DATATYPE #MODULE <ident> #; <comment> <ident> <op_list> <import_part> #END #MODULE <ident> #. ;
    adt_module(<ident>.1, <comment>, <ident>.2, <op_list>, <import_part>, <ident>.3)
  <comment> ::= %STRING ;
    comment-atom(%STRING)
  ...
Besides defining the input syntax, a METAL specification must determine those increments that can be edited in free textual input mode. METAL therefore provides the means to define entry points for the multiple entry parser. Entry point specifications are also given in a rule-based format. The left-hand side of the rule defines the axiom of the grammar. The right-hand side consists of the name of the entry phylum and the respective non-terminal symbol of the concrete syntax specification. Phyla MODULE, COMMENT and OP_LIST of the example are declared as entries below. These entry point definitions correspond to interaction definitions in GTSL, which define free textual input commands.
rules
<module> ::= [COMMENT] <comment> ;
<module> ::= [EXPORT_PART] <export_part> ;
<module> ::= [OP_LIST] <op_list> ;
...
In short, defining the abstract and concrete syntax of a language, as well as the entry points for a parser, is rather lengthy in METAL. This is because information about the abstract syntax is redundantly contained in the BNF-like rule definitions of the concrete syntax and, similarly, information about the BNF-like rule definitions is repeated in the entry point definitions. As a consequence, changes to the syntax may cause tedious updates of the METAL definition.
Unparsing Schemes in PPML:
Unparsing schemes are defined in Centaur using the pretty-printing meta language (PPML). They are called prettyprinters. More than one prettyprinter may be defined for a tool. A tool user can then switch at run-time between the different prettyprinters in order to change the way documents are displayed. A PPML specification consists of a set of rules that map operators, defined in the abstract syntax definition, to a textual output representation. A rule has the form pattern -> [format]. A pattern represents an operator with formal parameters. A format definition defines how these parameters are positioned and interleaved with text, such as keywords and symbols. The formatting is based on the notion of a box. Each leaf of the abstract syntax tree and each piece of text is considered to form an atomic box. Atomic boxes are glued into compound boxes by square brackets in a formatting specification. Separator definitions, given in angle brackets between boxes, define how a box is aligned. As an example, consider an excerpt of a prettyprinter for Groupie interface definitions below:
prettyprinter Standard of MIE is
rules
  adt_module(*ident1, *comm, *ident2, *op_list, *import_part, *ident3) ->
    [<v 0,0> [<h> "DATATYPE MODULE " *ident1 ";"]
     <v 4,1> *comm
     <v 4,1> [<v 0,0> "EXPORT INTERFACE"
              <v 2,1> [<h> "TYPE " *ident2]
              <v 2,1> *op_list ]
     <v 4,2> *import_part
     <v 0,1> [<h> "END MODULE " *ident3 "."] ]
  op_list(*operation, **op_list) ->
    [<v 0,0> *operation(**oplist) ]
  ...
end prettyprinter
The first rule defines the unparsing of a complete ADT module. A module is composed of five vertically aligned boxes. The first box contains the module head. It is itself composed of three atomic boxes, which are horizontally aligned. The second box is an atomic box that contains the module's comment. It is indented by four spaces relative to the outer box. Moreover, a blank line is inserted before the comment. Similar definitions determine the alignment of the type identifier, operation lists, import interface and the module tail. The unparsing scheme definition is very similar to both GTSL and SSL. PPML is more powerful in this respect since it supports the definition of several prettyprinters, which GTSL does not for reasons that have already been discussed.
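The box model can be made tangible with a minimal Python sketch. It is a simplified reading of the description above and not PPML itself: a box is represented as a list of text lines, a separator <v i,b> is taken to mean "indent by i spaces and insert b blank lines before the box", and the helper names hbox, vbox and adt_module, as well as the sample module text, are assumptions made for illustration.

```python
def hbox(*parts):
    """Glue atomic boxes horizontally into a single line."""
    return ["".join(parts)]


def vbox(*entries):
    """Stack boxes vertically.

    Each entry is (indent, blank_lines, box), where box is a list of lines.
    Blank lines are emitted before the box; non-empty lines are indented.
    """
    lines = []
    for indent, blanks, box in entries:
        lines.extend([""] * blanks)
        lines.extend((" " * indent + line) if line else line for line in box)
    return lines


def adt_module(ident1, comm, ident2, op_lines, import_lines, ident3):
    """Rough analogue of the first prettyprinter rule in the excerpt."""
    export = vbox(
        (0, 0, ["EXPORT INTERFACE"]),
        (2, 1, hbox("TYPE ", ident2)),
        (2, 1, op_lines),
    )
    return vbox(
        (0, 0, hbox("DATATYPE MODULE ", ident1, ";")),
        (4, 1, [comm]),
        (4, 1, export),
        (4, 2, import_lines),
        (0, 1, hbox("END MODULE ", ident3, ".")),
    )


# Invented sample content; the concrete Groupie syntax for operations and
# imports is not given in the excerpt.
layout = adt_module("Stack", "(* a stack *)", "StackT",
                    ["PROCEDURE push;"], ["IMPORT Elem;"], "Stack")
print("\n".join(layout))
```

Representing a box as a list of lines keeps nesting simple: a compound box is laid out once, and its lines are indented again when it is placed inside an outer vertical box.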
Static Semantics in TYPOL:
A TYPOL specification is based on a METAL abstract syntax definition. This syntax definition is imported with a use directive. A TYPOL specification then consists of sets of rules. Each set is a named collection of inference rules. These rules define a formal system in which it is possible to prove that a particular proposition holds. The declaration part of a set contains a judgement, which is a signature definition for the proposition to be proved by the set. Each inference rule has two parts, called numerator and denominator. The general form of a rule is:
    <numerator> --- <denominator>
The denominator is a sequent and the numerator is a list of sequents, also called premises. In natural semantics, sequents express the fact that some hypotheses are needed to prove a particular proposition. A sequent therefore has two parts, which are delimited by the turnstile symbol ⊢ (written |- in TYPOL). The first part of the sequent contains the hypotheses and the second part is called the consequent. The consequent of the denominator sequent is called the subject of the rule. Sequents are built from lists of expressions which are, in turn, formed from variables and operators defined in the METAL abstract syntax specification. Premises are then formed by a list of sequents separated by the ampersand sign &.
program SCOPING_MIE is

set MODULE_OK is
  judgement MODULE |- MODULE;

  NAME_OK(MOD|-ID2) & TYPE_OK(MOD|-ID2) & OPLIST_OK(MOD|-OPL) & IMPORT_OK(MOD|-IM)
    --- MOD |- adt_module(ID1,_, ID2, OPL, IM, ID3);
  ...
end MODULE_OK;

set NAME_OK is
  judgement MODULE |- IDENT;

  EQUAL(ID1|-ID3) & NOT_EQUAL(ID1|-ID2) & NOT_IN_OPLIST(OPL|-ID1)
    --- MOD |- adt_module(ID1,_, ID2, OPL, IM, ID3);
  ...
end NAME_OK;

set NOT_IN_OPLIST is
  judgement OP_LIST |- IDENT;

  op_list[] |- _;

  NOT_EQUAL(OPID|-ID) & NOT_IN_OPLIST(TAIL|-ID)
    --- op_list[procedure(OPID,_,_).TAIL] |- ID;

  NOT_EQUAL(OPID|-ID) & NOT_IN_OPLIST(TAIL|-ID)
    --- op_list[function(OPID,_,_,_).TAIL] |- ID;
end NOT_IN_OPLIST;
...
end SCOPING_MIE;
Figure 6.17: Excerpt of a TYPOL Specification Defining a Scoping Rule
Intuitively, an inference rule states that if all the sequents in the premises hold, the proposition expressed by the denominator sequent holds. The order of the premises in a rule is not important. An example of a TYPOL specification is shown in Figure 6.17. It defines a fragment of the scoping rules for the Groupie interface language.
The first rule set defines the overall correctness of modules. The subject of the first of its rules is determined by the METAL operator adt_module. Thus, this rule defines the static semantics of modules that are ADT modules. Such a module is correct if the sequents in the numerator part can be proved. These require the ADT module's name, its type, the operation list and the import to be correct. Very similar rules are defined for the other module types, but they are omitted here for reasons of brevity. In order to prove the correctness of the module name, the judgement defined in the rule set NAME_OK must hold for the match determined by MODULE_OK. Thus, we must be able to deduce the module identifier from the module hypothesis. In order to prove this, the three sequents in the numerator part of the rule must hold. Informally, this means that the identifier in the module's head must be equal to the identifier in the tail. Moreover, this identifier must be different from the exported type name. The third sequent requires that the module identifier does not occur in the operation list. While the rule sets for the first two sequents are trivial, the set to be proved for the third sequent requires a list traversal. This is defined in rule set NOT_IN_OPLIST. It consists of three rules. The first is an axiom that always holds. It states that the empty operation list does not contain any identifiers. The second rule states that the identifier does not occur in the list if the first element is a procedure, its name is unequal to the identifier and the identifier does not occur in the tail of the list. The third rule states the same for functions.
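Read operationally, the three rules of NOT_IN_OPLIST amount to a recursive list traversal. The following Python sketch mirrors that reading under stated assumptions: abstract syntax trees are encoded as plain tuples whose first element is the operator name, and the function names not_in_oplist and name_ok are illustrative only. It is not how the TYPOL compiler evaluates rules.

```python
def not_in_oplist(op_list, ident):
    """Mirror of rule set NOT_IN_OPLIST (judgement OP_LIST |- IDENT)."""
    if not op_list:
        return True  # axiom: the empty list contains no identifiers
    head, tail = op_list[0], op_list[1:]
    if head[0] == "procedure":  # procedure(OPID, _, _)
        _, op_id, _, _ = head
        return op_id != ident and not_in_oplist(tail, ident)
    if head[0] == "function":  # function(OPID, _, _, _)
        _, op_id, _, _, _ = head
        return op_id != ident and not_in_oplist(tail, ident)
    return False  # no rule matches, so the proposition cannot be proved


def name_ok(module):
    """Mirror of the NAME_OK premises for adt_module(ID1, _, ID2, OPL, IM, ID3)."""
    _, id1, _, id2, opl, _, id3 = module
    return id1 == id3 and id1 != id2 and not_in_oplist(opl, id1)


# Assumed tuple encoding of a small ADT module.
ops = [("procedure", "push", [], ""), ("function", "top", [], "Elem", "")]
good = ("adt_module", "Stack", "", "StackT", ops, [], "Stack")
clash = ("adt_module", "Stack", "", "StackT",
         ops + [("procedure", "Stack", [], "")], [], "Stack")
```

With this encoding, name_ok(good) holds, while name_ok(clash) fails because the module identifier reappears as an operation name. The two near-identical branches for procedures and functions correspond directly to the second and third rule of the set.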
TYPOL provides a powerful means for defining scoping rules and the type system of a language. The tool builder can define them in a declarative way and abstract from particular execution dependencies. Resolution of these dependencies is implemented in the TYPOL compiler by mapping a TYPOL program to a Prolog program, which exploits backtracking in order to find a valid execution sequence of rules.
Just as deficiencies were identified for SSL in the previous section, TYPOL is not capable of expressing inter-document consistency constraints. Again, the reason is that the universe over which expressions can be defined in TYPOL is determined by the abstract syntax specification of one language.
In addition, TYPOL rules are not as concise as they could be. Consider the last two rules in set NOT_IN_OPLIST. They differ only in the operator that is applied to the first list element in order to bind OPID. Similarly, the definitions of rules in set MODULE_OK differ only regarding the checks of exports, since all module types have imports and names. In GTSL, this deficiency is removed, since semantic rules are inherited by subclasses. The rule for module name correctness, for instance, is specified in class Module and inherited by all subclasses.
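The effect of inheriting semantic rules can be sketched with ordinary Python classes. This is an analogy only and not GTSL syntax: apart from the class name Module, which the text mentions, the subclass names, attributes and method names are hypothetical, and the checks are reduced to simple string comparisons.

```python
class Module:
    """Checks shared by all module types are stated once, in the superclass."""

    def __init__(self, head_name, tail_name, operation_names):
        self.head_name = head_name
        self.tail_name = tail_name
        self.operation_names = list(operation_names)

    def name_ok(self):
        # Head and tail identifiers agree, and no operation reuses the name.
        return (self.head_name == self.tail_name
                and self.head_name not in self.operation_names)

    def exports_ok(self):
        # By default there is nothing module-type specific to check.
        return True

    def ok(self):
        return self.name_ok() and self.exports_ok()


class AdtModule(Module):
    """Only the export check differs; name_ok is inherited unchanged."""

    def __init__(self, head_name, tail_name, operation_names, type_name):
        super().__init__(head_name, tail_name, operation_names)
        self.type_name = type_name

    def exports_ok(self):
        # The exported type must not carry the module's own name.
        return self.type_name != self.head_name


class FunctionModule(Module):
    """Inherits every check from Module without restating any of them."""
```

In the TYPOL version, the name check would have to be repeated in one rule per module operator; here it appears once and each subclass adds only what is specific to it.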
A further drawback of TYPOL is that it can only define correctness conditions for static semantics. It is not possible to define error messages to be displayed when an error is detected. Hence, it is particularly difficult for a user who may be unfamiliar with a language to understand why a sentence is considered wrong. Moreover, TYPOL offers no means of preventing the introduction of errors, such as change propagation to dependent increments. GTSL differs in that it includes a number of flexible mechanisms for defining error messages, and in that the strategy for accepting or rejecting erroneous input can be defined in interactions.