RIT Scholar Works
Theses
Thesis/Dissertation Collections
5-1983
Development of syntax-directed editor for Pascal
Chintana Promphan
Rochester Institute of Technology
School of Computer Science and Technology
DEVELOPMENT OF SYNTAX-DIRECTED EDITOR
FOR PASCAL
by
CHINTANA PROMPHAN
A thesis, submitted to
The Faculty of the School of Computer Science and Technology,
in partial fulfillment of the requirements for the degree of
Master of Science in Computer Science
Approved by:
I, Chintana Promphan, hereby grant permission to the Wallace Memorial Library, of RIT, to reproduce my thesis in whole or in part. Any reproduction will not be for commercial use or profit.
ACKNOWLEDGEMENTS
I would like to thank my thesis advisor, Dr. Lawrence A. Coon, for giving me the main idea of this thesis and for the guidance and information he gave me throughout the course of it. Without his help, this thesis could not have been completed. I would also like to thank Mr. Warren R. Carithers for the time he spent correcting my writing and for being a member of my committee. Lastly, I wish to thank Dr. Peter G. Anderson for the time he spent as a member of my committee.

ABSTRACT

A syntax-directed editor for Pascal is created by using an ALOE (A Language Oriented Editor) implementation tool running under the UNIX Operating System. An ALOE supports the construction and manipulation of language structures while guaranteeing syntactic correctness. An instantiation of ALOE for Pascal is generated from a grammatical description of the language. Programs are represented internally by ALOE as abstract syntax trees that the user manipulates directly; ALOE then maps this internal representation to a concrete representation for display to the user.

Key Words and Phrases: syntax-directed editor, language oriented editor, structure editor, template, constructive command
TABLE OF CONTENTS

1. INTRODUCTION
1.1. Goal of Thesis
1.2. Disadvantages of Text-Oriented Editors
1.3. Advantages of Text-Oriented Editors
1.4. Advantages of Syntax-Directed Editors
1.5. Disadvantages of Syntax-Directed Editors
1.6. Previous Work
2. PROJECT DESCRIPTION
2.1. The ALOE Implementation Tool
2.2. Screen Organization
2.3. The ALOE Generator
2.3.1. The Grammar Description
2.3.1.1. Language Name
2.3.1.2. Language Root
2.3.1.3. Terminal Productions
2.3.1.4. Non-terminal Productions
2.3.1.5. Classes
2.4. Building an ALOE for Pascal
2.5. Internal Representation of Syntax Tree
3. USER INTERFACE
3.3. Creating a Program
3.4. Moving the Cursor
3.5. Modifying the Program
4. SUMMARY
4.1. Missing Features
4.2. Enhancements
BIBLIOGRAPHY
APPENDIX A: ALOE GRAMMAR FOR PASCAL
APPENDIX B: ALOE EDITING COMMANDS
B.1. Cursor Movement
B.2. Searching Commands
B.3. Help Information
B.4. Tree Manipulation
B.5. Move to Marked Node
B.6. Input/Output
B.7. Exit ALOE
B.8. Display Manipulation
B.9. Other Commands
APPENDIX C: CONSTRUCTIVE COMMANDS
APPENDIX D: TERMINAL NODE TYPES AND UNPARSING SCHEME COMMANDS
D.1. Terminal Node Types
D.2. Unparsing Scheme Commands
LIST OF FIGURES

Figure 2.1. Construction of an IF statement
Figure 2.2. Editor session
Figure 3.1. Nesting an "assignment" statement into a "while" statement
Figure 3.2. Nesting an addition into a multiplication
Figure 3.3. Transformation of a "while" statement into an "if" statement
Figure 3.4. Transformation of an "if" statement into an "if then" statement
INTRODUCTION
In the past, most program editors were based on the textual viewpoint. These editors executed commands that interact with a program as a two-dimensional page of characters. Such commands include moving the cursor vertically and horizontally on the page, and adding and deleting characters, lines, and regions of text.

A number of editors are now appearing which are based on the parse tree viewpoint. These editors execute commands which include moving the cursor around in a syntax tree, deleting subtrees, and instantiating templates for subtrees. The idea of a syntax-directed editor is to superimpose a tree structure onto the text. Changes in the tree structure cause the text parts to be correspondingly reordered.

Syntax-directed editors help users create programs by preventing the entry of syntactically incorrect programs. They offer abbreviations for verbose constructs, and display programs in a pleasant, consistent fashion.

Language-oriented programming environments are designed around the language and thus can help and cooperate with the programmer in his task. In the process of creating and modifying programs, the programmer is able to concentrate on his algorithms and on the structure of his systems, instead of on their specific textual representation.
Syntax-directed editors are language-oriented editors in which programs and systems are created and modified according to the syntactic structure of the language instead of characters and lines. Some of the important characteristics of syntax-directed editors are the following [Medi82]:

A constructive approach. Programs and systems are built through constructive commands. Each command corresponds to a construct of the language. The burden of generating the concrete syntax is taken over by the editor.

Manipulation of programs and systems in terms of their structure. The programmer operates on his program or system in terms of its structure, not its textual representation.

Abstract syntax trees. Programs are represented internally as syntax trees that are built by the editor from constructive commands issued by the programmer. These trees are abstract syntax trees, whose nodes represent the language constructs. The elements of the concrete representation of languages, such as keywords, punctuation marks, separators, terminators, etc., are not kept in the tree.

Syntax tables. In syntax-directed editors generated from language descriptions, the syntactic knowledge of the languages obtained from these descriptions is kept by each editor in a collection of syntax tables. The editors use these tables to ensure the syntactic correctness of the programs being built.

Unparsing. The concrete representation of structures is produced by unparsing the tree; unparsing schemes specify the visual representation of each construct.

Uniform interface for the environment. The user interface of a syntax-directed editor provides the means for communication between user and environment.
1.1. Goal of Thesis

The purpose of this thesis is to design the grammatical descriptions to generate a syntax-directed editor for Pascal [Jens74] by using an ALOE [Medi81b] (A Language Oriented Editor) implementation tool running under the UNIX Operating System.
1.2. Disadvantages of Text-Oriented Editors

Richard Waters [Wate82] states that there seem to be three principal arguments which are used to support the idea that eliminating text oriented commands is either necessary or intrinsically beneficial.

(1) The textual viewpoint is obsolete. The structure oriented commands are much more convenient to use. As a result, nobody would want to use text oriented commands any more.

(2) The textual viewpoint is dangerous. It may be inconvenient to use structure oriented commands all of the time; however, this inconvenience is more than justified by the fact that this makes it impossible to create syntactically incorrect programs and therefore reduces errors.

(3) The textual viewpoint is incompatible with the structure oriented viewpoint. The implementation problems that would be created by having both structure oriented commands and text oriented commands are so severe that the text commands cannot be retained in a practical system.
1.3. Advantages of Text-Oriented Editors

Waters [Wate82] argues that each of the above arguments is fallacious.

(1) Structure oriented commands do many things better, but not everything. The fact that programs are displayed as two-dimensional images composed of characters makes the textual viewpoint very natural. It is convenient to be able to enter simple expressions as strings of text rather than build them up hierarchically as trees.

(2) While a program is being modified by text oriented commands, it may temporarily be in a state that cannot be parsed. This can be dealt with by reparsing the program after groups of text editor commands and/or by extending the parsing algorithm so that it can tolerate certain kinds of syntactic errors.

(3) It is true that some syntactic errors become impossible if all program modifications are performed strictly from the point of view of the parse tree. However, this is only true for local properties. More global properties, such as checking that every variable has a declaration, or that variables are used in ways that do not conflict with their declarations, must be checked whenever the programmer changes the program.

1.4. Advantages of Syntax-Directed Editors
(1) Entry and modification of program text are guided by a grammar for the host programming language. The incorporation of the grammar into the editor guarantees syntactically correct programs; there is no need for syntactic error repair because such errors are prevented on entry [Teit81b].

(2) Editing commands can be applied to chunks of the program as defined by intermediate nodes of the parse tree: the user can move around the program with reference to the parse tree, and subtrees can be copied and deleted [Wate82].

(3) The editor can check for various features of syntactic correctness as the programmer edits or enters the program. For example, it can check that the program is consistent with the programming language grammar and that each variable has a declaration [Wate82].

(4) The syntax-directed editor reduces typing effort by providing abbreviations for frequently occurring text elements such as keywords and by taking care of text formatting [Teit81b].

(5) A language-directed editor can form the basis of an integrated set of programming tools to support all phases of program development. The editor represents the syntactic structure of the program text in a parse tree, and this tree can subsequently be the subject of an interpreter, debugger, and/or code generator. Ultimately compilation as a separate and distinct step can be eliminated [Teit81b].
1.5. Disadvantages of Syntax-Directed Editors

(1) They are profligate consumers of computer resources, to the extent that they are usually designed for single-user systems.

(2) Parsing usually consumes processing power, the parse tree devours storage, and there is no solution but to supply plenty of each [Morr81].

1.6. Previous Work
There are a number of program editors which operate on the basis of the parse tree viewpoint. These include the standard Interlisp editor [Teit78], Mentor [Donz75], The Cornell Program Synthesizer [Teit81a], Gandalf [Medi81a], PASES [Shap81], CAPS [Wilc76], and Barstow's display oriented Interlisp editor [Bars81]. Each of these editors more or less totally discards the textual viewpoint.

The Cornell Program Synthesizer is an interactive programming environment with integrated facilities for creating, editing, executing, and debugging programs. The Synthesizer stimulates program conception at a high level of abstraction, promotes programming by stepwise refinement, and spares the user from the frustrations associated with syntactic detail. The Synthesizer's editor is a hybrid between a tree editor and a text editor. Templates are generated by command, while user-typed text is entered one character at a time. Errors in user-typed text are detected immediately because the parser is invoked by the editor on a phrase-by-phrase basis. By precluding the creation of syntactically incorrect programs, the Synthesizer lets the user focus on the intellectually challenging aspects of programming. The language implemented for the Synthesizer was PL/CS [Conv76], an instructional dialect of PL/I [Teit76].

Creating a program in the Synthesizer is exactly analogous to deriving a sentence with respect to a context-free grammar for the given programming language. The following are the correspondences:

    placeholder        nonterminal symbol
    template           right side of a production
    command            the name of a production
    insertion          derivation
    file               derivation tree
    cursor position    node of derivation tree
    file display       sentential form

The Synthesizer is not only a syntax-directed editor, but also a programming environment that integrates interpretation and debugging facilities. The program cursor in the Synthesizer is a single character cursor, as opposed to the area cursor of ALOE.
PASES is an interactive, residential programming environment with a structure editor and an interpreter, both operating on the same internal representation of Pascal structures. The current design for the system includes the following main modules: parser, structure editor, text editor, display and file pretty-printer, semantic checker, interpreter, code generator, command processor, and a top-level control structure that coordinates these modules.

The basic data structure in the system is a Pascal syntax tree. This is an abstracted parse tree generated by the parser. Trees of this type are stored in buffers. A buffer is essentially a pointer to the root of a syntax tree, with an external name that can be referred to by the user. The predefined buffers contain, among other things, templates for the most useful Pascal structures, including a program template, an equality expression template, etc. These buffers can be manipulated by the structure editor in the process of the top-down writing of a program.

The system derives almost all of its syntactic and semantic information about Pascal from an annotated BNF-like grammar. The parsing tables are generated by the UNIX program YACC, which uses this grammar as an input.
The MENTOR system [Donz80] is a structured editor for Pascal programs. Its primary input mode is parsing text of the language, although it also supports some form of constructive editing. The MENTOR implementors decided to support parsing of input because users were more comfortable writing their programs as text as a result of their experience with text editors. Users enter their programs as text but must deal with them structurally for editing and inspection. This might be confusing, since the user enters his program as text but edits it as a structure.

MENTOR is not an integrated system: a program must be unparsed into a text file before being compiled separately. MENTOR uses MENTOL, a tree manipulation language, as its command language. MENTOL is used to build routines to check context sensitive properties of Pascal programs and to perform searches using pattern matching.

The Interlisp System is a very sophisticated programming system for LISP. Interlisp incorporates powerful facilities like structured editing, sophisticated debugging techniques, automatic error correction, the programmer's assistant and others. Input to Interlisp is given as text, and it is parsed and inserted at the current position. Editing is ...
PROJECT DESCRIPTION

Structure editing of a syntactically rich language like Pascal is new enough that actual experiments with a programming environment for Pascal, like PASES [Shap81], were needed to understand its implications. Basic questions include how convenient it is to move with cursor commands on a tree structure rather than on a rectangular screen, or how natural it is to edit when the scope of the editing commands is the subtree of the node the cursor is pointing to rather than a letter or line.

2.1. The ALOE Implementation Tool
A Language Oriented Editor (ALOE) was first developed by the Department of Computer Science at Carnegie-Mellon University, running under the UNIX Operating System. ALOE is a tool which supports the construction and manipulation of language structures while guaranteeing syntactic correctness. Instantiations of ALOEs are generated from a grammatical description of a language. Trees (i.e., programs) are represented internally by ALOEs as abstract syntax trees that the user manipulates directly. A simple language is provided for mapping this internal representation to a concrete representation for display to the user.

The ALOE generator produces a set of tables that define the language knowledge of each ALOE. These tables are compiled and then loaded with the rest of the editor to form a new ALOE.

The ALOEs so generated are syntax-directed editors. This means that construction and modification of programs or structures is done by following the abstract syntax of a language (which defines the structure of the language). An ALOE can also be visualized as a constructive editor.
Language constructs (such as variables, operators, expressions, and different kinds of statements) can be added, modified, or removed. The user communicates with ALOE in terms of these language constructs.

The programmer constructs his program by inserting templates representing different language constructs and then filling the "holes" of those templates with other templates. Since ALOE knows which constructs are valid at any given point, it allows the programmer to insert a language construct only where it is syntactically correct.

For example, instead of entering the character sequence for an "IF" statement in Pascal, the programmer calls on the template IF. The result is the insertion of this template:

    if <expression>
    then <statement>

at the current program position, provided that this construct is legal there. This template contains the templates for expression and statement. The current program position is then advanced to <expression> so that it can be similarly expanded.

Internally ALOE represents the program as a "syntax tree". Each template corresponds to a node of a certain type in the tree. The holes of the template are the offspring of the node. They will be filled in with subtrees representing the expansion of those holes. Thus, the programmer actually constructs and manipulates a program tree without necessarily being aware of it.
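The template mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ALOE's implementation (ALOE itself is generated as C tables): the tables `TEMPLATES`, `CLASSES` and `DISPLAY`, and the names `Hole`, `Node`, `insert` and `show`, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical tables: each operator lists the class of each of its holes,
# and each class lists the operators that may legally fill such a hole.
TEMPLATES = {"IF": ["expression", "statement"],
             "ASSIGN": ["variable", "expression"],
             "IDENTIFIER": []}
CLASSES = {"statement": {"IF", "ASSIGN"},
           "expression": {"IDENTIFIER"},
           "variable": {"IDENTIFIER"}}
# Hypothetical display strings, standing in for unparsing schemes.
DISPLAY = {"IF": "if {0} then {1}", "ASSIGN": "{0} := {1}", "IDENTIFIER": "id"}


@dataclass
class Hole:
    cls: str                          # class of operators allowed here
    filled: Optional["Node"] = None   # None while the hole is unexpanded


@dataclass
class Node:
    op: str
    holes: List[Hole] = field(default_factory=list)


def insert(hole: Hole, op: str) -> Node:
    """Fill a hole with a fresh template, refusing illegal constructs."""
    if op not in CLASSES[hole.cls]:
        raise ValueError(f"{op} is not legal where a {hole.cls} is expected")
    hole.filled = Node(op, [Hole(c) for c in TEMPLATES[op]])
    return hole.filled


def show(hole: Hole) -> str:
    """Render a hole: a placeholder if unexpanded, otherwise its subtree."""
    if hole.filled is None:
        return f"<{hole.cls}>"
    node = hole.filled
    return DISPLAY[node.op].format(*(show(h) for h in node.holes))
```

Inserting `IF` into a statement hole displays `if <expression> then <statement>`; the two placeholders are the offspring that later insertions fill in, and an attempt to insert a statement where an expression is expected is rejected.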
In order to make the tree readable by the user, ALOE uses an unparser that translates the syntax tree into human-readable text. As part of its task, this unparser does the pretty-printing of the program. This unparsing process is driven by "unparsing schemes" specified in the grammatical description that describe the mapping from the abstract syntax to the concrete representation of each language construct.

The programmer interacts with ALOE through language commands (constructive commands) and editing commands. Language commands are used to insert new templates (e.g., the call for an IF template). Editing commands are used to manipulate the program tree (e.g., delete a subtree).

ALOE is extensible: extended commands can be defined to incorporate language specific functionality. Additionally, action routines can be specified in the grammatical description of the language to be invoked by ALOE in certain situations. They allow the implementor of an ALOE to perform semantic checking, automatically generate program pieces, attach or review information stored in the tree, etc.

In order to create the ALOE for a language like Pascal, a description of the grammar for the language to be edited must be created. The generator is an ALOE instantiation (ALOEGEN) which can generate a syntax-directed editor for any language according to its grammar. ALOEGEN is itself an interactive syntax-directed editor that is used to generate a grammatical description for a language.
2.2. Screen Organization

In every ALOE instantiation, the screen is divided into different windows. The main tree is displayed using a root window. Different contents of the tree can be displayed using different overlaying windows from a pool of tree windows. Clipped trees are displayed in clipped windows normally placed in the bottom half of the tree windows. Errors reported from action routines are displayed in an error window that normally overlays the tree windows. Help information is displayed in a "help" window. The status window is a one-line window normally placed near the bottom of the screen, and it is displayed only if the status mode is on.

2.3. The ALOE Generator
The ALOE generator produces an editing environment based on a grammatical description of a language. Every ALOE generated has three major components:

(1) The ALOE kernel common to all ALOEs. It understands and manipulates the internal representation of trees and is language independent. It provides an extensive set of editing commands. It also provides the default implementation of the set of environment-specific functions.

(2) The syntactic tables produced from the grammatical description of the language. They provide ALOE's syntactic knowledge of the language, as well as the unparsing knowledge through the unparsing schemes.

(3) Implementation of action routines, extended commands and environment-specific routines. These provide the language specific behavior of an ALOE that is beyond its syntactic capabilities.

2.3.1. The Grammar Description
The basic structure of a grammar has a ROOT node with five offspring as follows: language name, language root, terminal productions, nonterminal productions and classes.

2.3.1.1. Language Name
Each grammar has a language name associated with it. This is used to mark all tree files (programs) that are created and stored with an ALOE. The ALOE checks this name and does not attempt to edit a tree created by an ALOE for another language. The name of the file is the language name with the extension ".tr". For example, if the language name is "PASCAL", the name of the file will be "PASCAL.tr". The language name is displayed on the screen as:

    Language name: 'PASCAL'
2.3.1.2. Language Root

One of the non-terminals must be designated as the root of the grammar, or start symbol. ALOEGEN automatically creates a non-terminal with the name of the root operator after the root operator is created. The root is displayed to the user on the "Language root:" line of the screen.

2.3.1.3. Terminal Productions
The entire list of terminal operators is kept in a separate window. At the top level window, the terminal window is marked as:

    >terminals<

While the cursor is positioned at this node, the window must be entered by the ._IN (<cursor-down>) command to expand the window into the terminal window. The user leaves the terminal window by getting to the root of the window and entering ._OUT (<cursor-up>).

The terminal production is displayed on the screen as:

    $opident = $tertype
        | action: $action
        | synonym: $synonym
        | lex: $lex
        | ($schexp) $scheme
          $unparse ;
    $terminal

The character '|' is used to separate the different parts of the production. The description of each terminal symbol consists of six parts as follows:

(1) Terminal Operator Name ($opident). From the operator names, ALOEGEN generates the table <language-name>.infop consisting of "#define"s for each operator in the language. Any non-alphanumeric characters (except space and tab) will be replaced by an underscore.

(2) Terminal Type ($tertype). There are several terminal types which are used to specify the type of the terminal operator, for example, variable type, real type, character type, etc. As the terminal type is entered, if the lexical routine is not yet filled in, a default is automatically supplied. The types and the defaults are listed in Appendix D.
(3) Action Routines ($action). These are used to expand on the syntactic capabilities of the basic ALOE, including semantic checking, automatic generation of program pieces, and communication with other parts of the editing environment such as access control, code generation, etc. If there is no action routine, then "<none>" is displayed. The action routines are useful in an integrated programming environment such as GANDALF.
(4) Synonym ($synonym). Synonyms resolve conflicts between variable names and constructive commands. For example, suppose the class at this point contains both "IDENT" for type identifier and "INTEGER" for integer type. If the user wants this type to be integer and types in "in", the ALOE will accept the word "in" as an identifier instead of an integer type, which is wrong. If the INTEGER command has a synonym, such as ~I, the synonym may be used to unambiguously identify the type "integer". If there is no synonym for a terminal operator, then "<none>" is displayed.
(5) Lexical Routines ($lex). Lexical routines are associated with the terminal operator so that simple lexical analysis of this operator can be performed according to the lexical routine given in the grammar. For example, the terminal operator "IDENTIFIER" has a lexical variable routine so that every time the user enters an identifier name, ALOE will determine whether or not it is a legal identifier name. The default lexical routines associated with each terminal type are listed in Appendix D.
(6) Unparsing Schemes ($unparse). The definition of the unparsing schemes consists of a list of pairs. The first element of each pair ($schexp) is a list of numbers and ranges indicating which unparsing schemes are defined by the pair; the second element ($scheme) of the pair is a string that is the actual unparsing scheme, describing the "syntactic sugar" required by the language (like the parentheses around the input list of a "read" statement) and the way the operator is to be formatted. Lists of unparsing scheme commands are in Appendix D. For example, the unparsing schemes of the comment operator are:

    (0) "@>{ @c }"
    (1) "@>(* @c *)"

The @> indicates that four spaces are to be inserted, and the @c indicates that a character string is to be inserted. So these schemes mean that when the unparsing scheme is zero, the comment template is unparsed by inserting four spaces before the comment delimiter "{", followed by the content of the comment and "}". If the unparsing scheme is one, the comment delimiters will be '(*' and '*)' instead.

Also, the first element of the pair can be in the format (0:2,5) or (0,1,2,5), which means that the unparsing schemes 0, 1, 2, and 5 share the same actual unparsing scheme.

The particular unparsing scheme may be selected with the .sc command, given as .sc <number> (.scheme <number>) while editing, or by using the -u<number> command line switch to set the unparsing scheme used by ALOE to scheme <number> before starting an editing session. For example,

    pcedit -u2 <filename>

calls the Pascal syntax-directed editor and selects unparsing scheme two. Scheme zero is the default.
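To make the pair structure of part (6) concrete, the following Python sketch selects a scheme by number and expands it. It is a toy model under stated assumptions: the function names are hypothetical, the "@" command prefix is read from the comment example above, and only the three commands explained in this section (@>, @c, and @1, @2, ... for offspring) are handled; the full command set is in Appendix D.

```python
import re


def parse_selector(text):
    """Turn a selector such as '(0:2,5)' or '(0,1,2,5)' into a set of numbers."""
    numbers = set()
    for part in text.strip("() ").split(","):
        if ":" in part:
            low, high = part.split(":")
            numbers.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            numbers.add(int(part))
    return numbers


def pick_scheme(pairs, number):
    """Return the scheme string of the first pair whose selector covers number."""
    for selector, scheme in pairs:
        if number in parse_selector(selector):
            return scheme
    raise KeyError(number)


def expand(scheme, content="", offspring=()):
    """Expand @> (four spaces), @c (node content) and @1..@9 (offspring text)."""
    def replace(match):
        command = match.group(1)
        if command == ">":
            return "    "
        if command == "c":
            return content
        return offspring[int(command) - 1]
    return re.sub(r"@(>|c|[1-9])", replace, scheme)


# The comment operator's two schemes, as given in the text.
COMMENT = [("(0)", "@>{ @c }"), ("(1)", "@>(* @c *)")]
```

With these definitions, `expand(pick_scheme(COMMENT, 1), content="x")` produces four spaces followed by `(* x *)`, which mirrors what selecting scheme one with `.sc 1` is described as doing for comments.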
2.3.1.4. Non-terminal Productions

As with terminals, the list of non-terminals is kept in a separate window displayed as:

    >nonterminal<

Non-terminal productions are displayed on the screen as:

    $opident = $clspec
        | action: $action
        | synonym: $synonym
        | precedence: $precedence
        | Non-filenode
        | $unparse ;
    $nonterminal

The description of each non-terminal symbol consists of seven parts as follows:

(1) Non-terminal Operator Name ($opident). Same as for the terminal operator name.

(2) Offspring ($clspec). There are two types of nonterminals: fixed arity nodes (nodes with fixed numbers of offsprings) and variable arity nodes (nodes with variable numbers of offsprings). Fixed arity nodes are described by listing the set of offspring after the name of the operator. Each offspring is represented by the name of the class that specifies the legal operators for that field. Variable arity nodes are described by listing the name of the class from which elements of the list must be selected. This is displayed by enclosing the class name in angle brackets. Classes referred to as offspring of non-terminal operators are automatically generated if they do not already exist. Only the class identifier is supplied. The list of operators in the class must be filled in by the user.

(3) Action routines. Same as for the action routine of the terminal production.

(4) Synonym. Same as for the synonym of the terminal production.

(5) Precedence. Non-terminal operators optionally have precedence values associated with them. These are used to automatically parenthesize expressions while unparsing. The precedence values are only necessary for display purposes. Expressions, even though displayed in infix notation, are entered in prefix order with the operator first. For example, for the expression (a+b)*c, one would enter its constituents in the following order: * + a b c. The parentheses are automatically displayed as required by the operator precedence. If no precedence is assigned, "<none>" is displayed.

(6) Filenode. This is used to determine whether or not the non-terminal has its subtree stored in a separate file (for storage and checkpointing purposes). The root operator is automatically made a filenode by the generator. The default is a non-filenode.
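The interplay between prefix entry and precedence described in part (5) can be illustrated with a short Python sketch. It is a minimal model, not ALOE's unparser: the precedence values in `PRECEDENCE` and the function names are assumptions, only binary operators are handled, and associativity is ignored.

```python
# Hypothetical precedence values; a larger number binds more tightly.
PRECEDENCE = {"+": 1, "*": 2}


def build(tokens):
    """Consume tokens in prefix order and return a nested tuple tree."""
    token = next(tokens)
    if token in PRECEDENCE:
        left = build(tokens)    # first offspring
        right = build(tokens)   # second offspring
        return (token, left, right)
    return token                # an operand is a leaf


def unparse(tree, parent_precedence=0):
    """Display in infix, adding parentheses only where precedence requires."""
    if isinstance(tree, str):
        return tree
    operator, left, right = tree
    precedence = PRECEDENCE[operator]
    text = f"{unparse(left, precedence)}{operator}{unparse(right, precedence)}"
    return f"({text})" if precedence < parent_precedence else text
```

Entering `* + a b c` yields the tree `("*", ("+", "a", "b"), "c")`, which unparses as `(a+b)*c`; the parentheses never exist in the tree itself and appear only because the addition has a lower precedence than its parent.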
2.3.1.5. Classes

Classes are lists of types of offsprings, which define the set of terminal and nonterminal operators that are legal at that point. These are listed as the constructive commands when the novice mode is used. Classes are displayed in a separate window marked as:

    >classes<

Each class is displayed as:

    $clident = $opident ;
    $class

To enter or exit from the class window, press <cursor-down> and <cursor-up> respectively.

2.4. Building an ALOE for Pascal
STEP 1

Set the shell variable TERM to "vt100" and the environment variable EDITOR to "ex" so that ALOE will use ex as its text editor to edit a constant, long constant or text node. Add an environment variable GANDALFLIB which points to the directory which contains two files named libcmu.a and libgandalf.a. For example,

    setenv GANDALFLIB /acct/misc/gandalf/lib
    setenv EDITOR /usr/ucb/ex
    set noglob; eval `tset -s -Q ...`

STEP 2

Create the grammar for Pascal using the ALOE generator (ALOEGEN). This is in the /acct/misc/gandalf/bin directory. The tree file that ALOEGEN saves will have the name given by the user to ALOEGEN with the extension ".tr" added. The following shows the sequence of the ALOEGEN process for Pascal.
After invoking "ALOEGEN", the system prompts for the program name, which is a language name. The implementor then types in the <language name>; for example, PASCAL. The tree file named "PASCAL.tr" is created:

    Language name:
    Language root:
    >terminals<
    >nonterminals<
    >classes<
    >

The implementor then fills in all the unexpanded nodes: "$opident", "$terminals", "$nonterminals" and "$classes". The cursor is now positioned at the "$opident", which stands for the operator identifier. Typing in 'program' causes the "$opident" to be replaced.

    Language name: "PASCAL"
    Language root: $opident
    >terminals<
    >nonterminals<
    >classes<
    >program

    Language name:
    Language root:
    >terminals<
    >nonterminals<
    >classes<
Now, move the cursor to ">terminals<" (by using <cursor-right>), enter the separate window that contains the terminal operators (by using <cursor-down>), and enter the terminal symbol descriptions (described in section 2.3.1.3). An example of the terminal operator window is:

    /* terminal operators */
    REAL = {s}
        | action: <none>
        | synonym: <none>
        | lex: <none>
        | (0) "real" ;
    REALCONST = {r}
        | action: <none>
        | synonym: <none>
        | lex: lexreal
        | (0) "@c" ;
    INTCONST = {i}
        | action: <none>
        | synonym: <none>
        | lex: lexinteger
        | (0) "@c" ;

{s}, {r} and {i} are terminal node types ($tertype). {s} stands for a static type that represents a concrete piece of the language, {r} for a real node that contains a real value, and {i} for an integer node that contains an integer value. For more description of unparsing schemes, see Appendix D.
To leave the window, use <cursor-up>. Next, move the cursor to the non-terminal operators and repeat this process. Some examples of the nonterminal operators of PASCAL are:

    /* non-terminal operators */
    PROGRAM = prog
        | action: <none>
        | synonym: <none>
        | precedence: <none>
        | Filenode
        | (0) "@1" ;
    PROG = pgmhead block
        | action: <none>
        | synonym: <none>
        | precedence: <none>
        | Non-filenode
        | (0) "@n@1@2." ;
    PGMHEAD = ident idents
        | action: <none>
        | synonym: <none>
        | precedence: <none>
        | Non-filenode
        | (0) "program @1 (@2); " ;

The root operator PROGRAM contains one offspring, prog, and is a fixed arity node, whereas PROG contains the two offsprings "pgmhead" and "block". If the offspring is a variable arity node, this will be displayed by enclosing the class name in angle brackets. @1 and @2 indicate the unparsing of the first and second offspring respectively in the output.
Next are some of the PASCAL classes.

    /* classes */
    prog = PROG COMPROG ;
    pgmhead = PGMHEAD ;
    stat = PROCCALLNOPAR READ READLN WRITE WRITELN REPEAT CASE WITH
           WHILE TFOR DWFOR GOTO IF IFELSE COMPOUNDSTAT EMPTY PROCCALL
           LABELSTAT REWRITE RESET ASSIGN COMSTAT STATCOM ;
    exp = IDENTIFIER NE PLUS MINUS MULT DIVIDE EQ MOD GT GE DIV LT LE
          NOT NEGATE AND OR IN INTCONST INDEXVAR FIELDESIGN PCINTERVAR
          REALCONST SET EOF EOLN PROCCALL STRING NIL ;
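How such class tables decide which operators are legal at a given position can be sketched as follows. This is an illustrative Python model only: it copies a small excerpt of the tables shown above, and the dictionary layout and function names are assumptions rather than the format of the generated tables.

```python
# Excerpt of the classes listed above: class name -> member operators.
CLASSES = {
    "prog": ["PROG", "COMPROG"],
    "pgmhead": ["PGMHEAD"],
}

# Excerpt of the non-terminal productions: operator -> class of each offspring.
OFFSPRING = {
    "PROGRAM": ["prog"],
    "PROG": ["pgmhead", "block"],
}


def legal_operators(parent, index):
    """Operators that may fill offspring number `index` (0-based) of `parent`."""
    class_name = OFFSPRING[parent][index]
    # A class that was generated automatically but not yet filled in is empty.
    return CLASSES.get(class_name, [])


def is_legal(parent, index, operator):
    """True if `operator` may be inserted at that position."""
    return operator in legal_operators(parent, index)
```

Here `legal_operators("PROGRAM", 0)` returns `["PROG", "COMPROG"]`, which corresponds to the list of constructive commands offered at that position in novice mode.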
On exit from ALOEGEN, two sets of tables will be generated. These are PASCALtbl.c (the table of data structures in C) and PASCAL.infop (a set of definitions which indicate the values assigned to the operators of the language).
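As a rough picture of what the .infop definitions look like, the sketch below applies the naming rule from section 2.3.1.3 (non-alphanumeric characters other than space and tab become underscores) to a list of operator names. The numbering from zero, the example operator names, and the function names are assumptions for illustration; the actual values are chosen by ALOEGEN.

```python
import re


def c_name(operator):
    """Replace every non-alphanumeric character except space and tab by '_'."""
    return re.sub(r"[^A-Za-z0-9 \t]", "_", operator)


def infop_lines(operators):
    """One '#define' per operator, numbered in listing order (an assumption)."""
    return [f"#define {c_name(op)} {value}" for value, op in enumerate(operators)]
```

For instance, a hypothetical operator named `INT.CONST` would be emitted as `#define INT_CONST 1` if it were the second operator in the list.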
If
the
tables
have
been
generatedappropriately,
the
user willbe
asked whether or notthe
tables
shouldbe
compiled.Before
compiling
these
tables,
the
include
statementin
30
"usr/gandalf
/include/
da tadefs
to
"acct/misc/gandalf/include/datadefs
by
editing
this
file.
If there are classes which contain many operators, a 'line too long' problem might occur when the implementor tries to edit this file. If this happens, the implementor has to shorten those lines by using another terminal and running the file "PASCALtbl.c" through a program that shortens all the lines that are too long, before changing the "include" statement.
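The program used for this purpose is not shown. A minimal sketch of the idea, assuming the long lines are table initializers whose entries are separated by commas, could look like the following. The function name wrap_after_commas and the parameter soft are hypothetical. The function starts a new line after the first comma at or beyond column soft, and it leaves commas inside string and character literals alone.

```c
#include <stddef.h>

/* Copy in to out, inserting a line break after the first comma that falls at
   or beyond column soft and is not inside a string or character literal.
   Returns the length of the result, or (size_t)-1 if out (capacity cap)
   is too small. */
size_t wrap_after_commas(const char *in, char *out, size_t cap, size_t soft)
{
    size_t n = 0, col = 0;
    char quote = '\0';              /* '"' or '\'' while inside a literal */

    for (; *in != '\0'; in++) {
        char c = *in;

        /* room for c, one extra character, and the terminating NUL */
        if (n + 3 > cap)
            return (size_t)-1;
        out[n++] = c;
        col++;
        if (c == '\n') {
            col = 0;
            quote = '\0';
            continue;
        }
        if (quote != '\0') {
            if (c == '\\' && in[1] != '\0' && in[1] != '\n') {
                out[n++] = *++in;   /* copy the escaped character verbatim */
                col++;
            } else if (c == quote) {
                quote = '\0';
            }
        } else if (c == '"' || c == '\'') {
            quote = c;
        } else if (c == ',' && col >= soft) {
            out[n++] = '\n';
            col = 0;
        }
    }
    out[n] = '\0';
    return n;
}
```

Because a line break after a comma does not change the meaning of a C initializer, the shortened file should compile to the same tables. Lines without commas are left at their original length by this sketch.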
Then the implementor will be able to edit the "include" statement as mentioned before. Then the tables should be compiled, and the third table, PASCALtbl.o, will be generated.
STEP 3
ALOE can be extended to have language- or environment-specific behavior. There are three different kinds of mechanisms to provide these extensions: implementation of action routines, extended commands and environment-specific routines.
Environment-Specific Routines
The ALOE system provides a variety of standard routines which can be replaced. When a particular ALOE is going to be generated, some or all of the routines can be replaced with environment-specific implementations. The object files of the default routines are archived in '/acct/misc/gandalf/lib/DEFAULT.a'. If the user decides to modify some of the default routines, only the new routines must be written (using the same procedure names as the default routines) and compiled. The object files (.o) for these routines must be given to the loader before the DEFAULT.a archive file name. The loader will not pick routines from DEFAULT.a if they have already been defined.
The environment-specific routine for a lexical variable shown in Appendix A was written to replace the default 'lexvariable' routine.
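As an illustration of the kind of check a lexical routine for variables might perform, the C sketch below tests whether a string is a well-formed Pascal identifier (a letter followed only by letters or digits). It is not the routine of Appendix A and it does not follow the calling interface of 'lexvariable', which is not reproduced here; the name is_pascal_identifier is hypothetical.

```c
#include <ctype.h>

/* Return 1 if text is a letter followed only by letters or digits, else 0. */
int is_pascal_identifier(const char *text)
{
    if (!isalpha((unsigned char)text[0]))
        return 0;
    for (text++; *text != '\0'; text++)
        if (!isalnum((unsigned char)*text))
            return 0;
    return 1;
}
```

A replacement routine built around such a test would be compiled separately and its object file named before DEFAULT.a on the loader command line, as described above.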
Action Routines
Action routines are specified as part of the grammatical description of the language. An action routine is called when an instance of the node associated with it is created, deleted, exited, etc. When an action call is made, two parameters are always passed: the node where the action is occurring and the type of the action call. The procedure specification for C procedures that are action routines should be:
    actionROUTINE(thisnode, actkind)
    struct tnode *thisnode;
    int actkind;
    {
        /* implementation code */
    }
The values for the enumerated type represented by "actkind" are included in the ALOE library specification file "acct/misc/gandalf/include/ALOELIB.h". There are several types of action calls, for example, DELETE, INSERT, EDIT, NEST, TDELETE, UNNEST, RENAME and TRANSFORM. For more details of actions, see [Medilb]. For this thesis, some action routines, such as checking for an empty character string, an identifier which is a reserved word, or an undefined variable or type, are included in the grammatical description.
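A reserved-word check of the kind mentioned above might be sketched as follows. The sketch is self-contained and therefore does not include ALOELIB.h: the enumeration values, the layout of struct tnode (a single text field), the routine name check_reserved_action and the convention of returning 1 to reject are all assumptions made for the illustration. The word list is that of standard Pascal, and the comparison ignores case.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for declarations that the ALOE library would supply (assumed). */
enum { INSERT, DELETE, EDIT };
struct tnode { const char *text; };

static const char *const reserved[] = {
    "and", "array", "begin", "case", "const", "div", "do", "downto", "else",
    "end", "file", "for", "function", "goto", "if", "in", "label", "mod",
    "nil", "not", "of", "or", "packed", "procedure", "program", "record",
    "repeat", "set", "then", "to", "type", "until", "var", "while", "with"
};

/* Case-insensitive equality of two words. */
static int same_word(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* On INSERT, return 1 if the node text is a reserved word; otherwise 0. */
int check_reserved_action(struct tnode *thisnode, int actkind)
{
    size_t i;

    if (actkind != INSERT)
        return 0;
    for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (same_word(thisnode->text, reserved[i]))
            return 1;
    return 0;
}
```

The routine looks only at the kind of action call it cares about and lets every other call pass, which mirrors the two parameters that are always handed to an action routine.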
Extended Commands
Extended commands are added to the basic set of editing commands to expand on the capabilities of an ALOE. Their user interface is the same as that of editing commands. The names and synonyms of extended commands should be chosen so that they do not conflict with those of the basic editing commands.
STEP 4
Linking an ALOE.
The command for linking an ALOE (ldaloe) is
    ldaloe <language name> [<.o and .a files>]
For example, with "PASCAL" as the language name, lexvar.o as an environment-specific routine, and aSTR.o and aIDENT.o as action routines, this will generate an ALOE in the file named <language name>.aloe. In this case the file name would be PASCAL.aloe.
2.5. Internal Representation of Syntax Tree
Each template corresponds to a node of a certain type in the tree. The unexpanded nodes of the template, referred to as meta nodes, are the offspring of that node. They will be filled in with subtrees representing their expansion.
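The relation between a template and its meta nodes can be sketched in C. This is an illustration only and not ALOE's internal node structure: the type struct node and the functions new_meta, new_if, count_meta and free_tree are hypothetical, and the "if" template is assumed to have one offspring of class exp and one of class stat.

```c
#include <stdlib.h>

#define MAX_KIDS 3

struct node {
    const char *op;             /* operator name, or NULL for a meta node */
    const char *cls;            /* class the node belongs to */
    struct node *kid[MAX_KIDS];
    int nkids;
};

static struct node *new_node(const char *op, const char *cls)
{
    struct node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->op = op;
        n->cls = cls;
    }
    return n;
}

/* An unexpanded node that waits for an operator of class cls. */
struct node *new_meta(const char *cls)
{
    return new_node(NULL, cls);
}

void free_tree(struct node *n)
{
    int i;

    if (n == NULL)
        return;
    for (i = 0; i < n->nkids; i++)
        free_tree(n->kid[i]);
    free(n);
}

/* Template for "if": its offspring start out as meta nodes. */
struct node *new_if(void)
{
    struct node *n = new_node("IF", "stat");

    if (n == NULL)
        return NULL;
    n->kid[0] = new_meta("exp");
    n->kid[1] = new_meta("stat");
    n->nkids = 2;
    if (n->kid[0] == NULL || n->kid[1] == NULL) {
        free_tree(n);
        return NULL;
    }
    return n;
}

/* Number of meta nodes still to be expanded in the subtree. */
int count_meta(const struct node *n)
{
    int i, total;

    if (n == NULL)
        return 0;
    total = (n->op == NULL);
    for (i = 0; i < n->nkids; i++)
        total += count_meta(n->kid[i]);
    return total;
}
```

A freshly created "if" node therefore reports two meta nodes; replacing one of them with an expanded subtree lowers the count until the construct is complete.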
Figure 2.1 shows the display before the construction of an if statement in Pascal and the display resulting from its creation. Figure 2.2 shows the editing session for the construction of an "if" statement. The program cursor is indicated by a rectangle in the figure. Note that the cursor is an area cursor that highlights the entire textual expansion of the current subtree. A single character cursor would be ambiguous in a structure context because the cursor would be placed at the first character of the first element of the list and it would not be clear whether it refers to [...]
program add (input, output);
var x,y;
begin
  $stat
end.

program add (input, output);
var x,y;
begin
  if $exp then
    $stat;
  $stat
end.

[Figure 2.1: the display before and after the construction of an "if" statement]

[Figure 2.2: the editing session for the construction of an "if" statement]
CHAPTER 3
USER INTERFACE
An ALOE can be visualized as a constructive editor that understands the syntax of a language. Language constructs such as variables, operators, expressions, statements, declarations, etc., can be added, modified and removed. The user communicates with ALOE in terms of these language constructs.
The user constructs his program by inserting templates representing different language constructs and then filling in the unexpanded parts of those templates with other templates. Internally, ALOE represents the program as a syntax tree. Each template corresponds to a node of a certain type in the tree. The unexpanded nodes of the template are the offspring of the node. These unexpanded nodes are referred to as meta nodes. They will be filled in with subtrees representing their expansion.
3.1. Editing Commands
Editing and constructive commands are specified using different naming conventions. Editing commands are used to manipulate the program tree; they are prefixed with a period ("."). Constructive commands are used to insert new templates; they are prefixed with a comma (","). Only a sufficient number of letters to uniquely identify the choice must be given. There is no need to prefix the constructive command with "," if there are no classes where the lexical routines for multiple terminal operators could be recognized simultaneously. An appropriate synonym can be used to more easily enter the terminals (nodes which represent leaves of the tree) without corresponding lexical routines.
Command lines are terminated by a blank or <CR>. Several commands can be typed in a single line of input; ALOE will apply each command in sequence, and will update the display after the application of the last command. Drawbacks of this approach are especially evident when there are errors caused by a command, and the rest of the commands must be discarded or applied out of context.
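The prefix convention and the rule that a command may be abbreviated to any unique prefix can be sketched in C as follows. This is not ALOE's command interpreter: the names classify_command and match_unique_prefix are hypothetical, and any command names passed to them are placeholders rather than ALOE's actual command set.

```c
#include <string.h>

enum cmd_kind { CMD_EDITING, CMD_CONSTRUCTIVE, CMD_UNPREFIXED };

/* Classify a command word by its prefix; *name is set to the text after
   the prefix. An unprefixed word may still be a constructive command. */
enum cmd_kind classify_command(const char *word, const char **name)
{
    if (word[0] == '.') {
        *name = word + 1;
        return CMD_EDITING;
    }
    if (word[0] == ',') {
        *name = word + 1;
        return CMD_CONSTRUCTIVE;
    }
    *name = word;
    return CMD_UNPREFIXED;
}

/* Return the index of the only entry of names[] that begins with abbrev,
   -1 if there is none, or -2 if the abbreviation is ambiguous.
   An exact match is accepted even if it is also a prefix of another name. */
int match_unique_prefix(const char *abbrev, const char *const names[], int count)
{
    size_t len = strlen(abbrev);
    int i, found = -1;

    if (len == 0)
        return -1;
    for (i = 0; i < count; i++) {
        if (strncmp(names[i], abbrev, len) != 0)
            continue;
        if (names[i][len] == '\0')
            return i;
        found = (found == -1) ? i : -2;
    }
    return found;
}
```

With the placeholder names "delete", "display" and "insert", the abbreviation "de" selects the first entry, while "d" alone is reported as ambiguous.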
A <CR> is interpreted as a command when applied at a meta node (unexpanded node); otherwise, it has no meaning. If the meta node is an element of a list, the <CR> indicates that the user has finished entering the list elements. At this point the meta node is deleted, and the cursor is moved to the next meta node. If the class of the current meta node contains an EMPTY operator, then the EMPTY operator is applied.
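The cases for <CR> can be summarized as a small decision function. The C sketch below is an illustration under stated assumptions: the type and field names are hypothetical, the list case is tested before the EMPTY case because that is the order in which the text presents them, and a meta node that matches neither case is treated as ignoring the <CR>.

```c
enum cr_result { CR_IGNORED, CR_END_OF_LIST, CR_APPLY_EMPTY };

/* What the editor knows about the node under the cursor (hypothetical). */
struct cursor_state {
    int at_meta_node;       /* cursor is on an unexpanded node */
    int in_list;            /* that node is an element of a list */
    int class_has_empty;    /* its class contains the EMPTY operator */
};

/* Decide how a <CR> typed at the current cursor position is interpreted. */
enum cr_result interpret_cr(const struct cursor_state *s)
{
    if (!s->at_meta_node)
        return CR_IGNORED;          /* no meaning away from a meta node */
    if (s->in_list)
        return CR_END_OF_LIST;      /* delete meta node, move to the next one */
    if (s->class_has_empty)
        return CR_APPLY_EMPTY;      /* apply the EMPTY operator */
    return CR_IGNORED;              /* assumed: nothing applicable */
}
```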
The implementation of cursor highlighting is optimal [...] this thesis, an ALOE is to be generated to run on a GIGI terminal, so an inappropriate line update or cursor highlight might occur.
3.2. ALOE Modes
There are five mode settings in an ALOE:
(1) Cursor-follows mode indicates whether the cursor follows the tree structure or the current unparsing scheme. The default