
Small BASIC Expressions

In document The Art Of Java pdf (Page 78-84)

As they apply to the Small BASIC interpreter developed in this chapter, expressions consist of the following items:

Integers

The operators + - / * ^ = ( ) < > >= <= <>

Variables

In Small BASIC, the ^ indicates exponentiation. The = is used for both assignment and equality. However, relative to BASIC expressions, it is only an operator when used in a relational expression. (In standard BASIC, assignment is a statement and not an operation.) Not equal is denoted as <>. These items can be combined in expressions according to the rules of algebra. Here are some examples:

7 - 8

(100 - 5) * 14/6

a + b - c

10 ^ 5

A < B

The precedence of the operators is shown here:

Highest    ( )
           unary + -
           ^
           * /
           + -
Lowest     < > <= >= <> =
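To make the table concrete, the following sketch writes out how a few sample expressions group under these rules. It uses ordinary Java arithmetic with explicit parentheses, and Math.pow stands in for ^. The class and method names are illustrative only and are not part of the interpreter.

```java
// Illustration only: each method computes the value that the precedence
// table implies for a sample Small BASIC expression, with the grouping
// made explicit in plain Java. All names here are hypothetical.
class PrecedenceDemo {
    // 2 + 3 * 4 ^ 2 groups as 2 + (3 * (4 ^ 2))
    static double example1() {
        return 2 + (3 * Math.pow(4, 2));
    }

    // -2 ^ 2 groups as (-2) ^ 2, because unary minus ranks above ^ in the table
    static double example2() {
        return Math.pow(-2, 2);
    }

    // (100 - 5) * 14 / 6 groups as ((100 - 5) * 14) / 6
    static double example3() {
        return ((100 - 5) * 14) / 6.0;
    }

    // 1 + 2 < 4 groups as (1 + 2) < 4, because relational operators rank lowest
    static boolean example4() {
        return (1 + 2) < 4;
    }

    public static void main(String[] args) {
        System.out.println(example1());
        System.out.println(example2());
        System.out.println(example3());
        System.out.println(example4());
    }
}
```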

Operators of equal precedence evaluate from left to right. Small BASIC makes the following assumptions:

All variables are single letters; this means that 26 variables, the letters A through Z, are available for use.

The variables are not case sensitive; 'a' and 'A' will be treated as the same variable.

All numbers are doubles.

No string variables are supported, although quoted string constants can be used for writing messages to the screen.

These assumptions are built in to the parser.
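One way to picture these assumptions is a 26-element array of doubles indexed by letter. The sketch below is a minimal illustration of that idea, not the interpreter's actual code; the class VariableStore and its methods are hypothetical names.

```java
// Sketch: 26 double-valued variables, addressed case-insensitively by letter.
// VariableStore, set, get, and slot are hypothetical names for illustration.
class VariableStore {
    private final double[] vars = new double[26]; // slots for A..Z, initially 0.0

    // Map a variable name to an array slot using its first letter only.
    private static int slot(String name) {
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Empty variable name");
        }
        char c = Character.toUpperCase(name.charAt(0));
        if (c < 'A' || c > 'Z') {
            throw new IllegalArgumentException("Not a variable: " + name);
        }
        return c - 'A';
    }

    void set(String name, double value) {
        vars[slot(name)] = value;
    }

    double get(String name) {
        return vars[slot(name)];
    }

    public static void main(String[] args) {
        VariableStore store = new VariableStore();
        store.set("a", 3.5);
        System.out.println(store.get("A")); // same slot as "a"
    }
}
```

Because the slot is computed from the uppercased letter, 'a' and 'A' share storage, and every value is held as a double.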

Small BASIC Tokens

At the core of the Small BASIC parser is the getToken( ) method. This method is an expanded version of the one shown in Chapter 2. The changes allow it to tokenize not just numeric expressions, but also other elements of the Small BASIC language, such as keywords and strings.

In Small BASIC, each keyword token has two formats: external and internal. The external format is the text form that you use when writing a program. For example, "PRINT" is the external form of the PRINT keyword. Although it is possible for an interpreter to be designed in such a way that each token is used in its external string form, this is seldom (if ever) done because it is inefficient. Instead, Small BASIC operates on the internal format of a token, which is simply an integer value. For example, the PRINT command is represented by 1; the INPUT command by 2; and so on. The advantage of the internal representation is that much faster code can be written using integers rather than strings. It is the job of getToken( ) to convert the token from its external format into its internal format.
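The external-to-internal conversion can be sketched as a table lookup. The example below is a simplified illustration under stated assumptions: the codes 1 and 2 for PRINT and INPUT come from the text, while the value of UNKNCOM, the use of a HashMap, and the case-insensitive match are assumptions made here for the sake of a runnable example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch of converting a keyword from its external (text) form to its
// internal (integer) form. KeywordTable is a hypothetical name.
class KeywordTable {
    static final int UNKNCOM = 0; // assumed value meaning "not a keyword"
    static final int PRINT = 1;   // from the text: PRINT is represented by 1
    static final int INPUT = 2;   // from the text: INPUT is represented by 2

    private static final Map<String, Integer> TABLE = new HashMap<>();
    static {
        TABLE.put("PRINT", PRINT);
        TABLE.put("INPUT", INPUT);
    }

    // Return the internal code for s, or UNKNCOM if s is not a keyword.
    // Matching without regard to case is an assumption of this sketch.
    static int lookUp(String s) {
        Integer code = TABLE.get(s.toUpperCase(Locale.ROOT));
        return code == null ? UNKNCOM : code;
    }

    public static void main(String[] args) {
        System.out.println(lookUp("PRINT"));
        System.out.println(lookUp("X"));
    }
}
```

Once a keyword has been reduced to an integer, the rest of the interpreter can dispatch on it with a switch statement rather than repeated string comparisons.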

Chapter 3: Implementing Language Interpreters in Java


AppDev TIGHT/ The Art of Java / Schildt/Holmes / 222971-3 / Chapter 3

P:\010Comp\ApDev\971-3\ch03.vp Monday, July 07, 2003 10:03:20 AM


The Small BASIC getToken( ) method is shown here. It progresses through the program one character at a time.

  // Obtain the next token.
  private void getToken() throws InterpreterException
  {
    char ch;

    tokType = NONE;
    token = "";
    kwToken = UNKNCOM;

    // Check for end of program.
    if(progIdx == prog.length) {
      token = EOP;
      return;
    }

    // Skip over white space.
    while(progIdx < prog.length &&
          isSpaceOrTab(prog[progIdx])) progIdx++;

    // Trailing whitespace ends program.
    if(progIdx == prog.length) {
      token = EOP;
      tokType = DELIMITER;
      return;
    }

    if(prog[progIdx] == '\r') { // handle crlf
      progIdx += 2;
      kwToken = EOL;
      token = "\r\n";
      return;
    }

    // Check for relational operator.
    ch = prog[progIdx];
    if(ch == '<' || ch == '>') {
      if(progIdx+1 == prog.length) handleErr(SYNTAX);

      switch(ch) {
        case '<':
          if(prog[progIdx+1] == '>') {
            progIdx += 2;
            token = String.valueOf(NE);
          }
          else if(prog[progIdx+1] == '=') {
            progIdx += 2;
            token = String.valueOf(LE);
          }
          else {
            progIdx++;
            token = "<";
          }
          break;
        case '>':
          if(prog[progIdx+1] == '=') {
            progIdx += 2;
            token = String.valueOf(GE);
          }
          else {
            progIdx++;
            token = ">";
          }
          break;
      }
      tokType = DELIMITER;
      return;
    }

    if(isDelim(prog[progIdx])) {
      // Is an operator.
      token += prog[progIdx];
      progIdx++;
      tokType = DELIMITER;
    }
    else if(Character.isLetter(prog[progIdx])) {
      // Is variable or keyword.
      while(!isDelim(prog[progIdx])) {
        token += prog[progIdx];
        progIdx++;
        if(progIdx >= prog.length) break;
      }

      kwToken = lookUp(token);
      if(kwToken==UNKNCOM) tokType = VARIABLE;
      else tokType = COMMAND;
    }
    else if(Character.isDigit(prog[progIdx])) {
      // Is a number.
      while(!isDelim(prog[progIdx])) {
        token += prog[progIdx];
        progIdx++;
        if(progIdx >= prog.length) break;
      }
      tokType = NUMBER;
    }
    else if(prog[progIdx] == '"') {
      // Is a quoted string.
      progIdx++;
      ch = prog[progIdx];
      while(ch !='"' && ch != '\r') {
        token += ch;
        progIdx++;
        ch = prog[progIdx];
      }
      if(ch == '\r') handleErr(MISSINGQUOTE);
      progIdx++;
      tokType = QUOTEDSTR;
    }
    else { // unknown character terminates program
      token = EOP;
      return;
    }
  }

SBasic defines the following instance variables that are used extensively by getToken( ) and the rest of the interpreter code:

  private char[] prog;   // refers to program array
  private int progIdx;   // current index into program
  private String token;  // holds current token
  private int tokType;   // holds token's type
  private int kwToken;   // internal representation of a keyword

The program is stored in a character array that is referred to by prog. The specific location at which the interpreter is operating is stored in progIdx. The string version of the token is held in token. The token type is stored in tokType. The internal representation of a token representing a keyword is stored in kwToken.

The Small BASIC parser recognizes five token types: DELIMITER, VARIABLE, NUMBER, COMMAND, and QUOTEDSTR. DELIMITER is used both for operators and parentheses. VARIABLE is used when a variable is encountered. NUMBER is for numbers. The COMMAND type is assigned when a BASIC keyword is found. Tokens of type COMMAND require that an action be taken by the interpreter. Type QUOTEDSTR is for quoted strings.

Look closely at getToken( ). If the end of the program has been reached, then token is assigned EOP and the method returns. Otherwise, leading spaces are skipped with the help of the method isSpaceOrTab( ), which returns true if its argument is a space or tab. It is not possible to use Java's Character.isWhitespace( ) method (which returns true for any whitespace character) for this determination because BASIC recognizes the newline character as a terminator. Thus, for Small BASIC, white space is limited to just spaces and tabs. Assuming that trailing spaces don't end the program, once the spaces have been skipped, prog[progIdx] will be referring to either a number, a variable, a keyword, a carriage-return/linefeed sequence, an operator, or a quoted string.
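The body of isSpaceOrTab( ) is not shown in this section. A minimal version consistent with the description might look like the following sketch, which also shows why Character.isWhitespace( ) is unsuitable here. The enclosing class name is hypothetical.

```java
// Sketch of the whitespace test described above. WhitespaceDemo is a
// hypothetical wrapper class so that the example runs on its own.
class WhitespaceDemo {
    // Return true only for a space or a tab.
    static boolean isSpaceOrTab(char c) {
        return c == ' ' || c == '\t';
    }

    public static void main(String[] args) {
        // Line terminators are not skipped by isSpaceOrTab...
        System.out.println(isSpaceOrTab('\r'));
        // ...but Character.isWhitespace would treat them as white space.
        System.out.println(Character.isWhitespace('\r'));
    }
}
```

Skipping with Character.isWhitespace( ) would consume the carriage return and line feed, so the tokenizer would never see the end of a line.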

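If the next character is a carriage return, kwToken is set equal to EOL and a carriage return/line feed sequence is stored in token. Note that in the listing shown, this branch returns without assigning tokType, so tokType retains the value NONE that was set at the top of the method.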

Otherwise, getToken( ) checks for relational operators, which might be two-character operators, such as <=. getToken( ) converts two-character operators into their internal, one-character representation. The values NE, GE, and LE are defined as final values within SBasic. Next, getToken( ) checks for the other operators. If any type of operator is found, it is returned as a string in token and the type of DELIMITER is placed in tokType.
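The conversion of two-character operators can be isolated in a small helper, sketched below. The numeric values given to LE, GE, and NE are assumptions, since their definitions are not shown in this section, and the class and method names are hypothetical. Unlike getToken( ), which reports a syntax error when a relational operator is the last character of the program, this sketch simply treats that case as a one-character operator.

```java
// Sketch: collapse <=, >=, and <> into one-character internal codes.
// RelOpDemo and relOp are hypothetical names; the code values are assumed.
class RelOpDemo {
    static final char LE = 1; // assumed internal code for <=
    static final char GE = 2; // assumed internal code for >=
    static final char NE = 3; // assumed internal code for <>

    // Return the one-character token for the relational operator at idx.
    static String relOp(char[] prog, int idx) {
        char ch = prog[idx];
        char next = (idx + 1 < prog.length) ? prog[idx + 1] : '\0';
        if (ch == '<') {
            if (next == '>') return String.valueOf(NE);
            if (next == '=') return String.valueOf(LE);
            return "<";
        }
        if (ch == '>') {
            if (next == '=') return String.valueOf(GE);
            return ">";
        }
        throw new IllegalArgumentException("Not a relational operator: " + ch);
    }

    public static void main(String[] args) {
        String tok = relOp("A<=B".toCharArray(), 1);
        System.out.println(tok.length());      // every token is one character
        System.out.println(tok.charAt(0) == LE);
    }
}
```

Because every relational operator ends up as a single character, later stages of the parser can test token.charAt(0) without special handling for two-character forms.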

If the next character is not an operator, getToken( ) checks to see if it is a letter. If it is, then the token will be either a variable, such as A or X, or a keyword, such as PRINT. The lookUp( ) method checks to see if it is a keyword. If it is, lookUp( ) returns the appropriate internal representation of the keyword. If it is not a keyword, then the token is assumed to be a variable.

Otherwise, if the next character is a digit, then getToken( ) reads a number. If, instead, the next character is a quotation mark, then a quoted string is read. Finally, if the next character is none of the above, it is assumed that the end of the expression has been reached.

The rest of the parser works essentially the same as it did in Chapter 2 with the exception of evalExp1( ). In Chapter 2, evalExp1( ) was used to handle the assignment operator. However, in traditional BASIC, assignment is a statement, not an operation. Therefore, evalExp1( ) is not used for assignment when parsing expressions found in Small BASIC programs. Instead, it is used to evaluate the relational operators. If you use the interpreter to experiment with other types of languages, then you may need to add a method called evalExp0( ), which would be used to handle assignment as an operator.
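Since all Small BASIC values are doubles, a relational operator must produce a numeric result. The sketch below shows one plausible way to do this, returning 1.0 for true and 0.0 for false. That encoding, the internal code values, and the names used are assumptions for illustration and are not taken from the evalExp1( ) listing.

```java
// Sketch: apply a relational operator to two operands and return a
// double result. RelEvalDemo and evalRelational are hypothetical names.
class RelEvalDemo {
    static final char LE = 1; // assumed internal code for <=
    static final char GE = 2; // assumed internal code for >=
    static final char NE = 3; // assumed internal code for <>

    // Return 1.0 if the relation holds and 0.0 otherwise (assumed encoding).
    static double evalRelational(char op, double left, double right) {
        boolean result;
        switch (op) {
            case '<': result = left < right;  break;
            case '>': result = left > right;  break;
            case '=': result = left == right; break;
            case LE:  result = left <= right; break;
            case GE:  result = left >= right; break;
            case NE:  result = left != right; break;
            default:
                throw new IllegalArgumentException("Unknown operator code: " + (int) op);
        }
        return result ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(evalRelational('<', 1.0, 2.0));
        System.out.println(evalRelational(NE, 5.0, 5.0));
    }
}
```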

One other important difference between the parser in Chapter 2 and the one used here is that in Chapter 2, the end of the string that held the expression indicated the end of the expression. In this version, the end of the expression is signaled by the end of the line or anything else that is not a valid part of an expression, such as a keyword.

The Small BASIC parser recognizes only the variables A through Z. Although it will accept long variable names, only the first letter is significant. You can modify it to enforce single-letter variable names if you like.
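If you want to enforce single-letter names, one approach is to validate each token that getToken( ) has classified as a variable. The check below is a minimal sketch of that idea; the class and method names are hypothetical, and where the check is called from is left to the reader.

```java
// Sketch: accept only single-letter variable names, A through Z in
// either case. VarNameCheck and isValidVarName are hypothetical names.
class VarNameCheck {
    static boolean isValidVarName(String token) {
        if (token.length() != 1) return false;
        char c = Character.toUpperCase(token.charAt(0));
        return c >= 'A' && c <= 'Z';
    }

    public static void main(String[] args) {
        System.out.println(isValidVarName("A"));
        System.out.println(isValidVarName("COUNT"));
    }
}
```

A token that fails this check could then be reported as a syntax error instead of being silently treated as its first letter.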

